Graphs
Basic Terminology
A graph is a nonlinear data structure. It is a collection of nodes that have data and are connected to other
nodes.
A graph G consists of two things:
1. A set V of elements called nodes (or points or vertices)
2. A set E of edges such that each edge e in E is identified with a unique (unordered) pair [u, v] of nodes in
V, denoted by e = [u, v]
In the above Graph, the set of vertices V = {0,1,2,3,4} and the set of edges E = {01, 12, 23, 34, 04, 14, 13}.
Suppose e = [u, v]. Then the nodes u and v are called the endpoints of e, and u and v are said to be adjacent nodes or
neighbors.
• The degree of a node u, written deg(u), is the number of edges containing u. If deg(u) = 0, that is, if u does
not belong to any edge, then u is called an isolated node.
• A path P of length n from a node u to a node v is defined as a sequence of n + 1 nodes
P = (v0, v1, v2, ..., vn)
such that u = v0, vi-1 is adjacent to vi for i = 1, 2, ..., n, and vn = v.
• The path P is said to be closed if v0 = vn.
• The path P is said to be simple if all the nodes are distinct, with the exception that v0 may equal vn.
A cycle is a closed simple path with length 3 or more.
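The definitions above can be checked mechanically. The following is a minimal sketch, assuming the example graph with V = {0,1,2,3,4} and E = {01, 12, 23, 34, 04, 14, 13} given earlier; the function names (deg, is_path, is_closed, is_simple, is_cycle) are illustrative choices, not part of the original notes.

```python
# Undirected example graph from the text: E = {01, 12, 23, 34, 04, 14, 13}
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (1, 4), (1, 3)]
EDGE_SET = {frozenset(e) for e in EDGES}  # unordered pairs [u, v]


def deg(u):
    """Number of edges containing node u."""
    return sum(1 for e in EDGES if u in e)


def is_path(nodes):
    """True if every consecutive pair of nodes is joined by an edge."""
    return all(frozenset((a, b)) in EDGE_SET for a, b in zip(nodes, nodes[1:]))


def is_closed(nodes):
    """True if the first and last nodes coincide (v0 = vn)."""
    return nodes[0] == nodes[-1]


def is_simple(nodes):
    """All nodes distinct, except that v0 may equal vn."""
    inner = nodes[:-1] if is_closed(nodes) else nodes
    return len(set(inner)) == len(inner)


def is_cycle(nodes):
    """A closed simple path of length 3 or more (length = number of edges)."""
    return (is_path(nodes) and is_closed(nodes)
            and is_simple(nodes) and len(nodes) - 1 >= 3)


print([deg(u) for u in range(5)])  # [2, 4, 2, 3, 3]
print(is_path([0, 1, 2, 3]))       # True
print(is_cycle([0, 1, 4, 0]))      # True
print(is_path([0, 2]))             # False: 0 and 2 are not adjacent
```

Note that the degrees sum to 14, which is twice the number of edges (7), since every edge contributes to the degree of both of its endpoints.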
• Connected Graph: A connected graph is one in which a path exists between every two vertices
u, v in V. A connected graph with more than one node has no isolated nodes.
• A connected graph without any cycles is called a tree graph, a free tree, or simply a tree.
• Complete Graph
A complete graph is one in which every node is adjacent to every other node. A complete graph contains
n(n-1)/2 edges, where n is the number of nodes in the graph.
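The edge count follows because each of the n nodes pairs with n - 1 others and every pair is counted twice. A small sketch that enumerates the edges and checks the formula; the helper name complete_graph_edges is hypothetical.

```python
from itertools import combinations


def complete_graph_edges(n):
    """All unordered pairs of distinct nodes 0..n-1."""
    return list(combinations(range(n), 2))


for n in (1, 4, 5):
    edges = complete_graph_edges(n)
    assert len(edges) == n * (n - 1) // 2
    print(n, len(edges))  # prints "1 0", then "4 6", then "5 10"
```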
• Weighted Graph
A graph G = (V, E) is called a labeled or weighted graph if each edge has a value or weight
representing the cost of traversing that edge.
• Multiple edges: distinct edges e and e' are called multiple edges if they connect the same endpoints,
that is, if e = [u, v] and e' = [u, v].
• Loops: an edge e is called a loop if it has identical endpoints, that is, if e = [u, u].
• Multi Graph: If there are multiple edges between a pair of vertices in a graph G = (V, E), the graph is
referred to as a multigraph.
• Directed Graph: A directed graph, also referred to as a digraph, is a set of nodes connected by edges,
each with a direction. In other words, each edge e is identified with an ordered pair (u, v) of nodes in G
rather than an unordered pair [u, v].
Suppose G is a directed graph with a directed edge e = (u, v). Then e is also called an arc, and the
following terminology is used.
• e begins at u and ends at v.
• u is the origin or initial point of e, and v is the destination or terminal point of e.
• u is a predecessor of v, and v is a successor or neighbor of u.
• u is adjacent to v, and v is adjacent to u.
• The outdegree of a node u in G, written outdeg(u), is the number of edges beginning at u. Similarly,
the indegree of u, written indeg(u), is the number of edges ending at u.
• A node u is called a source if it has a positive outdegree but zero indegree. Similarly, u is called a sink if
it has zero outdegree but a positive indegree.
• A directed graph is said to be connected, or strongly connected, if for each pair u, v of nodes in G
there is a path from u to v and there is also a path from v to u. Graph G is said to be unilaterally
connected if for any pair u, v of nodes in G there is a path from u to v or a path from v to u.
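The directed-graph terms above can be illustrated on a small digraph. This is a sketch only: the arc list ARCS is a hypothetical example, not a graph from these notes, and the helper names are illustrative.

```python
from collections import defaultdict

# Hypothetical digraph: each arc (u, v) begins at u and ends at v.
ARCS = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
NODES = sorted({x for arc in ARCS for x in arc})

succ = defaultdict(list)  # successors of each node
for u, v in ARCS:
    succ[u].append(v)


def outdeg(u):
    """Number of arcs beginning at u."""
    return sum(1 for a, _ in ARCS if a == u)


def indeg(u):
    """Number of arcs ending at u."""
    return sum(1 for _, b in ARCS if b == u)


def reachable(u):
    """Set of nodes reachable from u along directed paths (including u)."""
    seen, stack = {u}, [u]
    while stack:
        for w in succ[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


reach = {u: reachable(u) for u in NODES}
sources = [u for u in NODES if outdeg(u) > 0 and indeg(u) == 0]
sinks = [u for u in NODES if outdeg(u) == 0 and indeg(u) > 0]
# Strongly connected: a path in BOTH directions for every pair.
strongly = all(v in reach[u] for u in NODES for v in NODES)
# Unilaterally connected: a path in AT LEAST ONE direction for every pair.
unilaterally = all(v in reach[u] or u in reach[v] for u in NODES for v in NODES)

print(sources, sinks)          # ['A'] ['D']
print(strongly, unilaterally)  # False True
```

In this example no path leads from D back to A, so the digraph is not strongly connected, but every pair is joined in at least one direction, so it is unilaterally connected.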
There are two standard ways of maintaining a graph G in the memory of a computer. One way, called the
sequential representation of G, is by means of its adjacency matrix A. The other way, called the linked
representation of G, is by means of linked lists of neighbors.
1. Adjacency Matrix
An adjacency matrix is a 2D array of size V x V, where V is the number of vertices in the graph. Let the 2D
array be adj[][]; a slot adj[i][j] = 1 indicates that there is an edge from vertex i to vertex j. The adjacency
matrix of an undirected graph is always symmetric. An adjacency matrix can also represent weighted
graphs: if adj[i][j] = w, then there is an edge from vertex i to vertex j with weight w.
Pros: The representation is easy to implement and follow. Removing an edge takes O(1) time. Queries
such as whether there is an edge from vertex u to vertex v can be answered in O(1) time.
Cons: Consumes O(V^2) space. Even if the graph is sparse (contains few edges), it
consumes the same space.
Fig: Graph Fig: Adjacency matrix
• Weighted Undirected Graph Representation
For a weighted graph, the weight or cost shown on each edge is stored in the matrix entry for that
edge in place of 1.
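A minimal sketch of the matrix representation, assuming the example graph with edges {01, 12, 23, 34, 04, 14, 13} from the terminology section. The weights in WEIGHTS are invented for illustration, and using 0 to mean "no edge" is an assumption that only works when no real edge has weight 0.

```python
N = 5
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (1, 4), (1, 3)]

# Unweighted, undirected: adj[i][j] = 1 means there is an edge i-j.
adj = [[0] * N for _ in range(N)]
for i, j in EDGES:
    adj[i][j] = 1
    adj[j][i] = 1  # mirror the entry so the matrix stays symmetric


def has_edge(i, j):
    """O(1) edge query."""
    return adj[i][j] != 0


def remove_edge(i, j):
    """O(1) edge removal (both directions for an undirected graph)."""
    adj[i][j] = 0
    adj[j][i] = 0


# Weighted variant: store the weight w instead of 1 (hypothetical weights).
WEIGHTS = [4, 8, 7, 9, 1, 2, 5]
wadj = [[0] * N for _ in range(N)]
for (i, j), w in zip(EDGES, WEIGHTS):
    wadj[i][j] = w
    wadj[j][i] = w

for row in adj:
    print(row)
print(has_edge(1, 3), has_edge(0, 2))  # True False
```

The matrix always holds N * N entries no matter how many edges exist, which is the space cost mentioned above.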
Adjacency List:
• An adjacency list is maintained for each node in the graph. Each entry stores a node value and
a pointer to the next node adjacent to that node. After the last adjacent node, NULL is stored
in the pointer field of the final entry of the list.
• In an undirected graph, the sum of the lengths of the adjacency lists is equal to twice the number of
edges.
• In a directed graph, the sum of the lengths of all the adjacency lists is equal to the number of edges
in the graph.
• In the case of a weighted directed graph, each list node contains an extra field that holds the weight of
the corresponding edge.
• An adjacency list is efficient in terms of storage because we only need to store entries for the
edges that exist. For a sparse graph with millions of vertices, this can mean a lot of saved space. It
is also easy to add new nodes to an adjacency list.
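A sketch of the list representation, assuming the same example edges {01, 12, 23, 34, 04, 14, 13}. Python lists stand in here for the linked lists with NULL terminators described above, and the weighted arcs are hypothetical.

```python
from collections import defaultdict

EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (1, 4), (1, 3)]

# Undirected: each edge [u, v] appears in the list of u and in the list of v.
adj = defaultdict(list)
for u, v in EDGES:
    adj[u].append(v)
    adj[v].append(u)

total = sum(len(neighbors) for neighbors in adj.values())
print(dict(adj))
print(total == 2 * len(EDGES))  # True: each edge is stored twice

# Weighted directed: each entry carries the weight of its edge
# (hypothetical arcs and weights).
ARCS = [("A", "B", 3), ("A", "C", 1), ("C", "B", 7)]
wadj = defaultdict(list)
for u, v, w in ARCS:
    wadj[u].append((v, w))  # stored once, in the list of the origin only

print(sum(len(entries) for entries in wadj.values()) == len(ARCS))  # True

# Adding a new node only needs a new, empty list.
adj[5] = []
```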
Fig: Adjacency List
Fig: Graph
Adjacency Multi-list representation
• Modified version of adjacency lists
• Edge based rather than vertex based representation of graph
GRAPH TRAVERSAL
• Graph traversal is the problem of visiting all the nodes in a graph in a particular
manner, updating and/or checking their values along the way. The order in which
the vertices are visited may be important, and may depend upon the particular
algorithm.
The two common traversals:
• breadth-first search
• depth-first search
• Breadth-First Search
• In a breadth-first search, we begin by visiting the start vertex v. Next, all unvisited vertices adjacent to v
are visited. Unvisited vertices adjacent to these newly visited vertices are then visited, and so on.
• Depth-First Search
• We begin by visiting the start vertex v. Next, an unvisited vertex w adjacent to v is selected, and a
depth-first search from w is initiated. When a vertex u is reached such that all its adjacent vertices have
been visited, we back up to the last vertex visited that has an unvisited vertex w adjacent to it and
initiate a depth-first search from w. The search terminates
when no unvisited vertex can be reached from any of the visited vertices.
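The depth-first description above is naturally recursive: the "back up" step corresponds to returning from a recursive call. A minimal sketch, assuming the example graph with edges {01, 12, 23, 34, 04, 14, 13} and neighbours tried in increasing order (an arbitrary choice); the function name dfs_recursive is illustrative.

```python
GRAPH = {
    0: [1, 4],
    1: [0, 2, 3, 4],
    2: [1, 3],
    3: [1, 2, 4],
    4: [0, 1, 3],
}


def dfs_recursive(graph, v, visited=None, order=None):
    """Visit v, then start a depth-first search from each unvisited neighbour."""
    if visited is None:
        visited, order = set(), []
    visited.add(v)
    order.append(v)
    for w in graph[v]:
        if w not in visited:
            dfs_recursive(graph, w, visited, order)
    # Returning here is the "back up to the last vertex visited" step.
    return order


print(dfs_recursive(GRAPH, 0))  # [0, 1, 2, 3, 4]
```

A different neighbour order would give a different, equally valid depth-first order.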
• BFS:
• Algorithm
Step 1: Set STATUS = 1 (ready state) for each node in G
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until QUEUE is empty
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set their
STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
• BFS:
• Algorithm
1. Put root in a Queue
2. Repeat until Queue is empty:
Dequeue a node
Process it
Add its children to the queue (if not already in the queue and not yet visited)
• 1. Add A to QUEUE1 and NULL to QUEUE2.
QUEUE1 = {A} QUEUE2 = {NULL}
• 2. Delete the node A from QUEUE1 and insert all its neighbours. Insert node A into QUEUE2.
QUEUE1 = {B, D} QUEUE2 = {A}
• 3. Delete the node B from QUEUE1 and insert all its neighbours. Insert node B into QUEUE2.
QUEUE1 = {D, C, F} QUEUE2 = {A, B}
• 4. Delete the node D from QUEUE1 and insert all its neighbours. Since its only neighbour, F, has
already been inserted, we will not insert it again. Insert node D into QUEUE2.
QUEUE1 = {C, F} QUEUE2 = {A, B, D}
• 5. Delete the node C from QUEUE1 and insert all its neighbours. Add node C to QUEUE2.
QUEUE1 = {F, E, G} QUEUE2 = {A, B, D, C}
• 6. Remove F from QUEUE1 and add all its neighbours. Since all of its neighbours have already been added, we
will not add them again. Add node F to QUEUE2.
QUEUE1 = {E, G} QUEUE2 = {A, B, D, C, F}
• 7. Remove E from QUEUE1. All of E's neighbours have already been added, so we will not add
them again. Add node E to QUEUE2.
QUEUE1 = {G} QUEUE2 = {A, B, D, C, F, E}
• 8. Remove G from QUEUE1. All of G's neighbours have already been processed, so we will not add them again.
Add node G to QUEUE2. All the nodes are now visited.
QUEUE1 = {} QUEUE2 = {A, B, D, C, F, E, G}
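The queue-based steps can be sketched in code as follows. The figure for the walkthrough is not reproduced in these notes, so the adjacency lists in GRAPH are an assumption, chosen to be consistent with the trace above; the STATUS values 1, 2, 3 follow the algorithm as stated.

```python
from collections import deque

READY, WAITING, PROCESSED = 1, 2, 3

# Assumed directed adjacency lists, consistent with the walkthrough.
GRAPH = {
    "A": ["B", "D"],
    "B": ["C", "F"],
    "C": ["E", "G"],
    "D": ["F"],
    "E": ["B", "F"],
    "F": ["A"],
    "G": ["E"],
}


def bfs(graph, start):
    """Return the nodes in the order they are processed (QUEUE2 in the trace)."""
    status = {node: READY for node in graph}  # Step 1
    queue = deque([start])                    # Step 2
    status[start] = WAITING
    order = []
    while queue:                              # Step 3
        node = queue.popleft()                # Step 4
        order.append(node)
        status[node] = PROCESSED
        for neighbour in graph[node]:         # Step 5
            if status[neighbour] == READY:
                queue.append(neighbour)
                status[neighbour] = WAITING
    return order


print(bfs(GRAPH, "A"))  # ['A', 'B', 'D', 'C', 'F', 'E', 'G']
```

Marking a node as waiting at the moment it is enqueued is what prevents F from entering the queue twice in step 4 of the trace.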
• DFS:
• Algorithm
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until STACK is empty
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push on the stack all the neighbours of N that are in the ready state (whose STATUS = 1)
and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
• DFS:
• Algorithm
Push the starting node A on the stack
Repeat until STACK is empty:
Pop the top node N.
Process it
Push on the stack all the neighbours of N (unvisited and not already in the stack)
• Push H onto the stack. STACK: H
• Pop the top element of the stack, i.e. H, print it and push all the neighbours of H onto the stack that are in the ready state.
Print H STACK: A
• Pop the top element of the stack, i.e. A, print it and push all the neighbours of A onto the stack that are in the ready state.
Print A STACK: B, D
• Pop the top element of the stack, i.e. D, print it and push all the neighbours of D onto the stack that are in the ready state.
Print D STACK: B, F
• Pop the top element of the stack, i.e. F, print it and push all the neighbours of F onto the stack that are in the ready state.
Print F STACK: B
• Pop the top element of the stack, i.e. B, print it and push all the neighbours of B onto the stack that are in the ready state.
Print B STACK: C
• Pop the top of the stack i.e. C and push all the neighbours.
Print C STACK: E, G
• Pop the top of the stack i.e. G and push all its neighbours.
Print G STACK: E
• Pop the top of the stack i.e. E and push all its neighbours.
Print E STACK:
• Hence, the stack now becomes empty and all the nodes of the graph have been traversed.
• The printing sequence of the graph will be :
H -> A -> D -> F -> B -> C -> G -> E
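The stack-based steps can be sketched the same way. As the figure is not reproduced here, the adjacency lists in GRAPH are an assumption, chosen to be consistent with the stack contents in the trace above.

```python
READY, WAITING, PROCESSED = 1, 2, 3

# Assumed directed adjacency lists, consistent with the walkthrough.
GRAPH = {
    "H": ["A"],
    "A": ["B", "D"],
    "B": ["C", "F"],
    "C": ["E", "G"],
    "D": ["F"],
    "E": ["B", "F"],
    "F": ["A"],
    "G": ["E"],
}


def dfs_stack(graph, start):
    """Return the nodes in the order they are popped and printed."""
    status = {node: READY for node in graph}  # Step 1
    stack = [start]                           # Step 2
    status[start] = WAITING
    order = []
    while stack:                              # Step 3
        node = stack.pop()                    # Step 4: pop the top node
        order.append(node)
        status[node] = PROCESSED
        for neighbour in graph[node]:         # Step 5
            if status[neighbour] == READY:
                stack.append(neighbour)
                status[neighbour] = WAITING
    return order


print(" -> ".join(dfs_stack(GRAPH, "H")))  # H -> A -> D -> F -> B -> C -> G -> E
```

Because nodes are marked as waiting when pushed, this variant can visit nodes in a different order from a recursive depth-first search on the same graph.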
SPANNING TREE :
• A spanning tree of a graph G is a set of |V|-1 edges that connect all vertices of the graph. Thus a
spanning tree for G is a graph T = (V', E') with the following properties:
• V’ = V
• T is connected
• T is acyclic.
Minimum Spanning Tree
• In general, it is possible to construct multiple spanning trees for a graph G. If a cost Cij is
associated with each edge Eij = (Vi, Vj), then the minimum spanning tree is the set of edges Espan
forming a spanning tree such that:
• C = sum( Cij | all Eij in Espan ) is a minimum. E.g., the graph shown has 16 spanning trees.
Properties
• Removing any edge from a spanning tree makes it disconnected.
• Adding any one edge to a spanning tree creates a cycle.
• If all edge weights are distinct, the minimum cost spanning tree is unique.
• A complete undirected graph with n nodes has n^(n-2) spanning trees.
• A disconnected graph does not have any spanning tree.
• A spanning tree can be constructed from a complete graph by removing (e-n+1) edges, the maximum
number that can be removed, where e is the number of edges and n is the number of nodes.
For the above complete graph, n = 4 and e = 6, so 6-4+1 = 3 edges are removed.
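The counts for the complete graph on 4 nodes can be confirmed by brute force. This is a sketch only: it tries every choice of n - 1 edges and keeps those that form no cycle, using a small union-find structure; the helper name is_spanning_tree is illustrative.

```python
from itertools import combinations


def is_spanning_tree(n, edges):
    """True if the n-1 given edges connect nodes 0..n-1 without a cycle."""
    if len(edges) != n - 1:
        return False
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True  # n-1 edges and no cycle means connected and acyclic


n = 4
all_edges = list(combinations(range(n), 2))  # complete graph on 4 nodes
e = len(all_edges)
count = sum(is_spanning_tree(n, subset)
            for subset in combinations(all_edges, n - 1))

print(e, e - n + 1)           # 6 3
print(count, n ** (n - 2))    # 16 16
```

Of the 20 ways to pick 3 of the 6 edges, only the 4 triangles fail, which leaves the 16 spanning trees.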
• Kruskal's algorithm: Kruskal gave an approach for determining a minimum cost spanning tree of a
graph. Here a minimum cost spanning tree T is built edge by edge. Edges are considered
for inclusion in T in non-decreasing order of their costs. An edge is included in T if it does not form
a cycle with the edges already in T.
• Since G is connected and has n > 0 vertices, exactly (n-1) edges will be selected for inclusion in T.
• Algorithm
1. Sort the edges by weight in ascending order.
2. Select the lowest cost edge from the list and remove it from the list. If adding it to the tree would
create a cycle, discard it; otherwise add it to the tree.
3. Repeat step 2 and stop when (n-1) edges have been added to the tree.
Edge    AB  DE  BC  CD  AE  AC  AD
Weight  1   2   3   4   5   7   10
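Kruskal's steps can be sketched on the edge table above. This is one possible implementation, not a definitive one: cycle detection is done with a union-find (disjoint set) structure, which is an implementation choice the notes do not specify, and the function name kruskal is illustrative.

```python
# Edges and weights taken from the table above.
EDGES = [("A", "B", 1), ("D", "E", 2), ("B", "C", 3), ("C", "D", 4),
         ("A", "E", 5), ("A", "C", 7), ("A", "D", 10)]


def kruskal(edges):
    """Return the list of (u, v, weight) edges chosen for the spanning tree."""
    nodes = {x for u, v, _ in edges for x in (u, v)}
    parent = {x: x for x in nodes}  # each node starts in its own component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda edge: edge[2]):  # Step 1
        root_u, root_v = find(u), find(v)
        if root_u != root_v:           # Step 2: no cycle, so keep the edge
            parent[root_u] = root_v
            tree.append((u, v, w))
            if len(tree) == len(nodes) - 1:  # Step 3: n-1 edges chosen
                break
    return tree


tree = kruskal(EDGES)
print(tree)                        # [('A', 'B', 1), ('D', 'E', 2), ('B', 'C', 3), ('C', 'D', 4)]
print(sum(w for _, _, w in tree))  # 10
```

For this table the four cheapest edges happen to form no cycle, so the algorithm stops before AE, AC and AD are considered.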
• Prim's algorithm is used to find a minimum spanning tree of a graph. It finds a
subset of edges that forms a tree including every vertex of the graph, such that the sum of the
weights of the edges is minimized.
• Prim's algorithm starts with a single node and, at every step, explores all the edges connecting
the tree built so far to adjacent nodes. The edge with the minimum weight that causes no cycle
is selected.
• Prim's algorithm
1. Choose an arbitrary vertex Vi and add it to the tree.
2. Select the lowest cost edge that connects a vertex Vi already in the tree to a vertex Vj not yet in
the tree, so that no cycle is formed, and add it to the tree.
3. Repeat step 2 and stop when n-1 edges have been added to the tree.
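Prim's steps can be sketched as follows. Two assumptions are made here: the edge list reuses the weights from the Kruskal example table (AB 1, DE 2, BC 3, CD 4, AE 5, AC 7, AD 10), and a binary heap (Python's heapq) is used to pick the lowest cost edge, which the notes do not prescribe. The function name prim is illustrative.

```python
import heapq
from collections import defaultdict

EDGES = [("A", "B", 1), ("D", "E", 2), ("B", "C", 3), ("C", "D", 4),
         ("A", "E", 5), ("A", "C", 7), ("A", "D", 10)]


def prim(edges, start):
    """Grow a spanning tree from start; return the chosen (u, v, weight) edges."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((w, u, v))  # weight first so the heap orders by cost
        adj[v].append((w, v, u))

    in_tree = {start}             # Step 1: start from an arbitrary vertex
    heap = list(adj[start])       # candidate edges leaving the tree
    heapq.heapify(heap)
    tree = []
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)  # Step 2: cheapest candidate edge
        if v in in_tree:
            continue                   # both ends in the tree: would form a cycle
        in_tree.add(v)
        tree.append((u, v, w))
        for candidate in adj[v]:
            if candidate[2] not in in_tree:
                heapq.heappush(heap, candidate)
    return tree                        # Step 3: n-1 edges for a connected graph


tree = prim(EDGES, "A")
print(tree)                        # [('A', 'B', 1), ('B', 'C', 3), ('C', 'D', 4), ('D', 'E', 2)]
print(sum(w for _, _, w in tree))  # 10
```

The edges are chosen in a different order than in Kruskal's algorithm, but the total cost of 10 is the same.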
Shortest Path Algorithm