Unit 1 and 2
INTRODUCTION
Algorithm: It is a sequence of unambiguous (clearly specified) instructions for solving a
problem, i.e., for obtaining a required output for any legitimate (valid) input in a finite
amount of time.
It is one of the most basic tools used to develop problem-solving logic.
It should not rely on unstated assumptions.
Features of an Algorithm (or) characteristics:
1. Finiteness: The algorithm must terminate after a finite number of steps.
2. Unambiguity: There must be no ambiguity in any instruction.
3. General Solution: It must provide a general solution for any valid input related to the
problem.
4. Order of execution: It must define the order in which the instructions are executed.
5. Definiteness: Each step of the algorithm must be precisely (exactly) defined.
6. The algorithm must be a step-by-step procedure.
7. It should be expressed in simple, plain language.
8. Inputs: Valid inputs must be specified.
9. Output: Correct output must be produced from the given valid input; at least one
quantity is produced.
10. Effectiveness: Each instruction must be basic and simple.
Quality of a Good Algorithm: The following parameters are used to evaluate the quality
of a good algorithm.
1. Time efficiency
2. Space Efficiency
3. Accuracy
4. Sequence
5. Generality – The algorithm must give the general solution.
Notion of Algorithm:
TIME COMPLEXITY
Goal: It is used to find the best algorithm for solving the given problem
Efficiency Evaluation: The efficiency of an algorithm can be evaluated in two ways.
1. Time efficiency – how fast the algorithm works
2. Space efficiency – it deals with the extra memory space the algorithm requires
Units for measuring running time of an algorithm: [ T(n) ]
Methods for calculating running time of an algorithm:
1. Using the computer to measure the time efficiency
2. Frequency count
3. Basic operation
1. Using the computer: Measure the running time of a program implementing the
algorithm for an input of size n.
Drawbacks:
a. It depends on the speed of the particular machine (CPU and RAM).
b. It depends on the quality of the program used for the implementation.
c. It needs extra time for the implementation.
2. Frequency Count: Count the number of times each instruction in the
algorithm is executed.
Drawback:
a. If the algorithm is too complicated, it is very difficult to calculate its efficiency.
3. Basic operation: Measure the number of times the basic operation is executed.
(Basic operation – the operation that contributes most to the running time and is used to solve the problem)
T(n) = Counting the number of times the basic operation is executed.
General expression for calculating time complexity or running time of an
algorithm:
T(n) ≈ Cop * C(n)
Where, C(n) – Number of times the basic operation is executed.
Cop – Time needed to execute the basic operation once.
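For illustration, a minimal C sketch (the function name and data are made up for this example) that counts the basic operation – the comparison – while finding the maximum of n elements; here C(n) = n - 1, so T(n) ≈ Cop * (n - 1):

#include <stdio.h>

/* Returns the maximum of a[0..n-1]; *count receives the number of comparisons made. */
int find_max(const int a[], int n, long *count)
{
    int max = a[0];
    *count = 0;
    for (int i = 1; i < n; i++) {
        (*count)++;                 /* one execution of the basic operation */
        if (a[i] > max)
            max = a[i];
    }
    return max;                     /* C(n) = n - 1 comparisons in every case */
}

int main(void)
{
    int a[] = {3, 9, 2, 7, 5};
    long c;
    int m = find_max(a, 5, &c);
    printf("max = %d, C(n) = %ld\n", m, c);   /* prints: max = 9, C(n) = 4 */
    return 0;
}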
BASIC EFFICIENCY CLASSES
SPACE COMPLEXITY
Definition: The amount of memory consumed while executing an algorithm with
respect to the given input size.
General Expression to calculate the space complexity:
Space complexity = Code space + Input size space + Auxiliary space
Code Space: The amount of memory is used to store the code segment or program
segment in RAM while executing an algorithm. (Object code or binary code or
executable code).
- It is very difficult to predict during the design phase of the SDLC, so it is not considered
an important factor.
Auxiliary Space: The extra or temporary space used by an algorithm, excluding the
input.
Input Size Space: The amount of memory used for storing the inputs.
Expression to calculate the space complexity:
Space complexity ≈ Auxiliary space
ASYMPTOTIC NOTATIONS
Purpose of Asymptotic Notations:
It is used to compare and rank the orders of growth of algorithms.
It defines the mathematical notations used to describe the running time of an
algorithm.
It is used to express the fastest and slowest possible running times of an algorithm (best
case and worst case scenarios).
It is a mathematical tool for representing the complexity of an algorithm.
The complexity of an algorithm is expressed in terms of the input size.
Need for Asymptotic Notation:
1. It defines the characteristics of algorithms efficiency.
2. It is used to compare the performances of various algorithms.
Types of Asymptotic Notations:
a. Big-O Notation (O-notation)- Worst case complexity- Maximum time
required for program execution.
b. Omega Notation (Ω-notation) – Best case complexity- Minimum time
required for program execution.
c. Theta Notation (Θ-notation) – Average case complexity- Average time
required for program execution.
Big-O Notation (O-notation) - Worst case complexity
o Definition: A function t(n) is said to be in O(g(n)), denoted by t(n) ∈ O(g(n)), if
t(n) is bounded above by some constant multiple of g(n) for all large n, i.e., if
there exist some positive constant c and some non-negative integer n0 such that (c –
constant multiplier and n0 – break-even point)
t(n) ≤ c*g(n) for all n ≥ n0
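For example, t(n) = 100n + 5 is in O(n): since 100n + 5 ≤ 100n + n = 101n for all n ≥ 5, the definition is satisfied with c = 101 and n0 = 5.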
o Representation: (figure omitted)
o Notes:
a. Use Stirling's formula n! ≈ sqrt(2πn) (n/e)^n for large values of n.
b. If both the numerator and the denominator tend to ∞, apply L'Hôpital's rule.
(Differentiate the numerator and the denominator separately with respect to n.)
o Examples:
1. Compare the order of growth of n and n².
2. Compare the order of growth of log n and sqrt(n).
3. Compare the order of growth of n! and 2^n.
4. Compare the order of growth of n*(n-1)/2 and n².
Properties:
1. General Property: If f(n) is O(g(n)) then a*f(n) is also O(g(n)), where a is a
constant.
Example: f(n) = 2n²+5 is O(n²); then 7*f(n) = 7(2n²+5) = 14n²+35 is also O(n²).
2. Transitive Property: If f(n) is O(g(n)) and g(n) is O(h(n)) then f(n) = O(h(n)).
Example: f(n) = n, g(n) = n² and h(n) = n³; n is O(n²) and n² is O(n³), so n is
O(n³)
3. Symmetric Property: If f(n) is Θ(g(n)) then g(n) is Θ(f(n)).
Example: f(n) = n² and g(n) = n²; then f(n) = Θ(n²) and g(n) = Θ(n²)
4. If f(n) = O(g(n)) and d(n)=O(e(n)) then f(n) + d(n) = O( max( g(n), e(n) ))
Example: f(n) = n i.e O(n) , d(n) = n² i.e O(n²) , then f(n) + d(n) = n + n² i.e O(n²)
5. If f(n)=O(g(n)) and d(n)=O(e(n)) then f(n) * d(n) = O( g(n) * e(n))
Example: f(n) = n i.e O(n), d(n) = n² i.e O(n²), then f(n) * d(n) = n * n² = n³ i.e O(n³)
Lower Bound
Linear Search (or) Sequential Search
Aim (or) Objective (or) Goal: Find the search element in the given list of elements
using linear Search.
Input:
1. Number of elements (n)
2. List of Elements (array – a[])
3. Search element (se)
Output: Element exist or not exist. (if exist return 1 else return -1)
Algorithm Design Technique: Brute Force Method
Concept of Linear Search Algorithm:
1) Scanning the given list of elements from left to right (one element at a time)
2) Matching the scanning element with the search element. If element found stop the
scanning process and return 1.
3) If the element is not found in the list, the search operation to be terminated at the
end and return -1.
Basic Operation: Comparison (= =)
Non – Recursive Algorithm for Linear Search:
Algorithm linear_search(a[], n, se)
//Problem Description: Search the search element in a list
//Input: List of elements, size n and search element se
//Output: return 1 if exist and return -1 if not exist
for(i=0 ; i<n; i++)
{
if(se==a[i])
break;
}
if(i<n)
return 1;
else
return -1
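A hedged, runnable C version of the pseudocode above (names are illustrative):

/* Linear search: returns 1 if se is present in a[0..n-1], otherwise -1. */
int linear_search(const int a[], int n, int se)
{
    for (int i = 0; i < n; i++) {
        if (a[i] == se)            /* basic operation: comparison */
            return 1;              /* element found */
    }
    return -1;                     /* element not found */
}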
Algorithm Analysis:
a. Best Case: The search element exists as the first element in the list.
The basic operation is executed only one time.
T(n) ≈ Ω(1)
b. Worst Case: The search element may be found at the end of the list or may not exist
in the list.
Number of times the basic operation is executed = n, if the element is found at the last position.
Number of times the basic operation is executed = n, if the element is not found.
Therefore, the worst case complexity of linear search is defined as
T(n) ≈ O(n)
c. Average Case: The search element may be found at any location or index of the given list.
T(n) = (n+1)/2 ≈ Θ(n)
Advantages of Linear Search (or) Strengths:
a. Very simple to implement
b. It can be used on both 1D and 2D lists
c. It performs fast when the size of the list is small
d. It is not affected by insertions and deletions in the list (the logic does not
change)
e. No need to sort the list of elements
Disadvantages:
a. It is very slow when searching in a large set or list (the time complexity is high) – not
efficient for large data sets.
Binary Search
o Aim (or) Objective: the main objective is to search a particular element in a given
list.
o Need for Binary Search: (Drawback of Linear Search)
a. It is very slow when searching in a large set or list (the time complexity is high) – not
efficient for large data sets.
o Input: the algorithm needs three inputs. They are,
a. Total number of elements in the list
b. Elements in the list
c. Search element
o Output: the outcome of this algorithm is to return whether the element exists in the list or
not.
o Condition for binary search: The input list must be in sorted order (either
ascending or descending).
o Algorithm Design Technique: Divide and Conquer
o Basic operation of binary search: Comparison (==)
o Working Principle or Concept:
a. The search element is compared with the array's middle element. If it matches, the
algorithm stops. (A[mid] == search)
b. Otherwise, check whether the search element is less than the array's middle
element. If so, the left half of the list is considered for the search, from the
starting index up to mid-position − 1.
i.e., if (A[mid] > search) then end = mid − 1
c. If the above two conditions are not satisfied, then the right half of the list is considered
for the search, from mid-position + 1 up to the last index of the array.
i.e., if (A[mid] < search) then start = mid + 1
Note: Here the problem size is reduced by half of its original size.
(Here only the part of the list is considered for the process if the search element is
not matched with mid-position value)
o Non – recursive Algorithm for Binary Search:
Algorithm binary_search(a[], n, se)
//Problem Description: It is used to search for a particular element in a list
//Input: List of elements (a) , size (n) and search element (se)
//Output: Display exist or not
low=0
high=n-1
while(low<=high)
{
mid = (low+high)/2
if(a[mid]==se)
break;
else if(a[mid]>se)
high=mid-1;
else
low=mid+1;
}
if(low<=high)
return mid
else
return -1
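A hedged, runnable C version of the iterative algorithm above (illustrative; it assumes a[] is sorted in ascending order and returns the index of the element, or -1):

int binary_search(const int a[], int n, int se)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* same as (low+high)/2, but avoids overflow */
        if (a[mid] == se)
            return mid;                     /* index of the search element */
        else if (a[mid] > se)
            high = mid - 1;                 /* continue in the left half */
        else
            low = mid + 1;                  /* continue in the right half */
    }
    return -1;                              /* element not found */
}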
o Recursive Algorithm for Binary Search:
Algorithm binary_search(a[], low, high, se)
//Problem Description: It is used to search for a particular element in a list
//Input: List of elements (a), lower index (low), upper index (high) and search element (se)
//Output: Display exist or not
//Initial call: binary_search(a, 0, n-1, se)
if(low<=high)
{
mid = (low+high)/2
if(a[mid]==se)
return mid;
else if(a[mid]>se)
return binary_search(a, low, mid-1, se);
else
return binary_search(a, mid+1, high, se)
}
return -1
o Analysis of Binary Search: (time Complexity)
T(n) = T(n/2)+1, T(1) =1
Solving, the above recurrence relation,
T(n) = log2n +1 ≈ O(log2n)
o Advantages of Binary Search:
a. Takes less time for searching element in the sorted list
o Disadvantages of Binary Search:
a. Records must be either in ascending or descending order.
o Applications:
a. It is efficient searching method used to search in database records
b. It is used for solving non-linear equations with one unknown variable.
Interpolation Search (Variant of Binary Search)
Goal of Interpolation Search: It reduces the searching time of binary Search.
Concept: In binary search, the probe always takes place at the middle position,
irrespective of the value of the search element.
o In interpolation search, the probe position is chosen according to the value of the key being searched.
o Last element: If the search element is closer to the last element, interpolation search
is likely to start the search towards the end of the list.
o The complexity of searching for an element near the ends (corners) of the list is reduced.
Condition: The input list of elements must be in sorted order.
Formula to find the location (K): (applicable only when the list contains more than one
element)
K = low + ((high – low) / (arr[high] – arr[low])) * (search_data – arr[low])
Algorithm:
ALGORITHM interpolation(a[], n, se)
//Problem Description: Find whether the search element exists in the list or not.
//Input: List of elements (a), number of elements (n) and search element (se)
//Output: Exist or not exist
low=0
high=n-1
while(low<=high && se>=a[low] && se<=a[high])
{
p = low + (high-low)/(a[high]-a[low]) * (se-a[low])
if( a[p]==se)
return p;
else if(a[p] < se)
low =p+1
else
high = p-1
}
return -1
Time Complexity of Interpolation Search: Average Case : O(log2(log2 n)) and Worst
Case : O(n)
Space Complexity: O(1)
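A hedged C sketch of the interpolation search above (names are illustrative; it assumes an ascending sorted array and guards against division by zero when all remaining keys are equal):

int interpolation_search(const int a[], int n, int se)
{
    int low = 0, high = n - 1;
    while (low <= high && se >= a[low] && se <= a[high]) {
        if (a[high] == a[low])                          /* all remaining keys are equal */
            return (a[low] == se) ? low : -1;
        /* probe position computed from the value of the key being searched */
        int p = low + (int)((long)(high - low) * (se - a[low]) / (a[high] - a[low]));
        if (a[p] == se)
            return p;
        else if (a[p] < se)
            low = p + 1;
        else
            high = p - 1;
    }
    return -1;
}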
Pattern Matching Algorithm
Goal: check whether the pattern exist in the given string or text or document.
Input: Text string and pattern string
Output: Pattern exist or not exist
Condition: the length of the pattern (m) must be less than or equal to the length of the
given string (or) text (n). i.e., m≤n.
Basic Operation : Comparison (= =)
Different Types of Algorithm:
1. Naïve string matching algorithm
2. Rabin - Karp Algorithm
3. Knuth – Morris - Pratt Algorithm
Rabin–Karp Algorithm
Algorithm RK_matching(T[0…n-1], p[0…m-1])
//Problem description: Search for the pattern (substring) in the given text
//Input: Input text (or) string and pattern
//Output: report that the pattern exists or that the pattern does not exist
n = strlen(t);                      /* find the lengths of the string and the pattern */
m = strlen(p);
if(n>=m)
{
    h=0; h1=0;                      /* calculate the initial hash values for the pattern and the first text window */
    for(i=m-1, po=1; i>=0; i--, po=po*10)
    {
        h = h + (p[i]+1)*po;
        h1 = h1 + (t[i]+1)*po;
    }
    po = po/10;
    for(i=0; i<n-m+1; i++)
    {
        if(h%13 == h1%13)           /* hash values matched, so verify the window character by character */
        {
            for(j=0; j<m; j++)
            {
                if(p[j]!=t[i+j])
                    break;
            }
            if(j==m)
            {
                printf("\nPattern exist");
                break;
            }
        }
        if(i < n-m)                 /* roll the hash of the text window: drop t[i], shift, append t[i+m] */
            h1 = (h1 - (t[i]+1)*po)*10 + (t[i+m]+1);
    }
}
else
    printf("Pattern not exist");
Knuth–Morris–Pratt Algorithm
o Goal: reduce the searching time of pattern matching algorithm. (linear running
time)
o Compute the prefix array for the pattern (1-based indexing)
m = Length(pattern)
Initialize pref[1] = 0
k=0
for(q=2; q<=m ; q++)
{
while(k>0 && p[k+1] != p[q])
k = pref[k]
if(p[k+1]==p[q])
k+=1
pref[q]=k
}
o Compute KM Matching Algorithm
n = Length(Text)
m = Length(pattern)
pref[] = compute_Prefix(pattern)
k=0
for(q=1; q<=n ; q++)
{
while(k>0 && p[k+1] != T[q])
k = pref[k]
if(p[k+1]==T[q])
k+=1
if k==m
{
print "Pattern occurs with shift", q-m
k = pref[k]    // continue searching for further occurrences
}
}
o Complexity: Time Complexity O(n)
o Example: pattern = "ababaca" and Text = "bacbaababacacbab"
Compute the prefix values for the pattern:
Pattern: a b a b a c a
Prefix : 0 0 1 2 3 0 1
Pattern Exist: Location 5
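A hedged, 0-indexed C sketch of the same idea (the pseudocode above uses 1-based indexing; names and the fixed buffer size are illustrative):

#include <stdio.h>
#include <string.h>

/* pref[q] = length of the longest proper prefix of p[0..q] that is also a suffix of p[0..q]. */
static void compute_prefix(const char *p, int m, int pref[])
{
    pref[0] = 0;
    int k = 0;
    for (int q = 1; q < m; q++) {
        while (k > 0 && p[k] != p[q])
            k = pref[k - 1];
        if (p[k] == p[q])
            k++;
        pref[q] = k;
    }
}

/* Prints every shift at which the pattern occurs in the text. */
void kmp_match(const char *text, const char *p)
{
    int n = (int)strlen(text), m = (int)strlen(p);
    int pref[256];                     /* assumes the pattern is shorter than 256 characters */
    compute_prefix(p, m, pref);
    int k = 0;                         /* number of pattern characters currently matched */
    for (int q = 0; q < n; q++) {
        while (k > 0 && p[k] != text[q])
            k = pref[k - 1];
        if (p[k] == text[q])
            k++;
        if (k == m) {
            printf("Pattern occurs with shift %d\n", q - m + 1);
            k = pref[k - 1];           /* keep scanning for further occurrences */
        }
    }
}

For the example above, kmp_match("bacbaababacacbab", "ababaca") reports shift 5.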
Insertion Sort
Goal or Aim or Objective: Sort the list of elements in either ascending or descending
order
Need for Insertion Sort: Improve the efficiency or time complexity of bubble or
selection sort.
It is one of the simplest sorting algorithms to implement.
Input:List of elements
Output: List of elements in either ascending or descending order
Basic Operation: Comparison
Algorithm Design technique: Decrease and Conquer
Swapping: No swapping is required
Assumption: First element in the list is in sorted order
Properties of Insertion Sort:
o Supports Stable property
o Supports In-place property
Working Principle:
1. Assume the first element in the list is always sorted
2. Starting from the second element, find the location where the current element belongs in the
sorted left part of the list and insert the element at the identified location.
3. The above step is repeated until the last element in the list is processed
Algorithm:
ALGORITHM insertion_sort(a[],n)
//problem description: Arrange the list of elements in either ascending or descending
order
//Input: List of elements (a[]) and number of elements (n)
//Output: Ascending or descending order (a[])
for(i=1; i<n; i++)
{
copy=a[i]
for(j=i-1; j>=0 && a[j]>copy ; j--)
a[j+1] = a[j]
a[j+1] = copy
}
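A hedged, runnable C version of the pseudocode above (names are illustrative; note that the j >= 0 test is evaluated before a[j] > copy so the array is never read out of bounds):

void insertion_sort(int a[], int n)
{
    for (int i = 1; i < n; i++) {
        int copy = a[i];                    /* element to be inserted */
        int j = i - 1;
        while (j >= 0 && a[j] > copy) {     /* shift larger elements one place right */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = copy;                    /* insert at the identified location */
    }
}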
Analysis of Insertion Sort:
o Best case Analysis: The input list is already in ascending order (so the inner loop is not executed)
So the best case complexity T(n) ≈ O(n)
o Worst Case Complexity: The input is in descending order
T(n) ≈ O(n2)
o Average Case: T(n) ≈ O(n2)
Space Complexity Analysis: Only one extra memory location is required (copy). S(n) ≈ O(1)
Suitable Applications:
a. Only a few elements are out of order and the remaining elements in the list are already
sorted
b. Number of data elements in the list is small
Heap Sort
Goal or Aim or Objective: Sort the list of elements in either ascending or descending
order
Need for Heap Sort: Improve the efficiency or time complexity of sorting algorithm.
Input: List of elements (a[]), Number of Elements (n)
Output: List of elements in either ascending or descending order
Basic Operation: Comparison
Algorithm Design technique: Transform and Conquer
Swapping: swapping is required (heapify)
Definition of Heap Tree: It is a binary tree with keys assigned its nodes based on the
following conditions.
a. Tree shape requirement: The binary tree must be essentially complete (or)
complete tree. i.e., all its levels are full except possibly the last level, where rightmost
leaves may be missing.
b. Parental Dominance (or) Heapify Property (or) Requirement: The key at each
node is greater than or equal to the keys of its children in the case of a max heap, and less than
or equal to the keys of its children in the case of a min heap.
Types of Heap:
a. Max heap – the value of the parent node is greater than or equal to the values of its child nodes
b. Min heap – the value of the parent node is less than or equal to the values of its child nodes.
Characteristics of Heap Tree:
a. Height: The height of a heap tree with n nodes is ⌊log2 n⌋.
b. Parent Node: The value of the parent node is either the minimum or the maximum compared
to its children.
c. Descendant or Successor: Every subtree of a heap is also a heap.
d. Position of parent and child nodes: In the array representation, the parent nodes
occupy the first ⌊n/2⌋ positions and the remaining positions hold the leaf
elements.
e. Index: The array index starts with 1.
Construction of Heap Tree:
a. Bottom – up construction (construct the entire tree and verify the properties)
b. Top – down Construction (for each element insertion verify the parental
dominance property)
Algorithm for Heap Sort:
ALGORITHM heap_sort(a[], n)
//Problem Description: Arrange the list of elements in ascending or descending order
//Input: List of elements (a[]) and number of elements (n)
//Output: Ascending or descending order (a[])
for(m=n; m>1 ; m--)
{
for(i=m/2 ; i>=1; i--)   // rebuild the max heap for a[1..m]
{
for(k=i; 2*k<=m ; )
{
j=2*k;
if(j<m)
{
if(a[j] < a[j+1])
j=j+1   // choose the larger child
}
if(a[k]>=a[j])
break;   // parental dominance holds
else
{
t=a[k]; a[k]=a[j]; a[j]=t; k=j;
}
}
}
t=a[1] ; a[1] = a[m] ; a[m] =t;   // move the current maximum to the end of the heap
}
Analysis of Time Complexity:
a. Best Case Analysis : O(n log n)
b. Average case Analysis : O(n log n)
c. Worst Case Analysis : O(nlogn)
Space Complexity : O(1) - No extra spaces are required
Stable: No
In-place Algorithm: Yes
Example: sort the list of elements {23,5,6,8,19,32,45,67,6,17} into ascending order
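As a compact alternative to the pseudocode above, a hedged 0-indexed C sketch of heap sort built around the standard sift-down (heapify) routine (names are illustrative; the heap is built bottom-up once and the maximum is then repeatedly moved to the end):

/* Sift a[i] down within a[0..m-1] so that the subtree rooted at i is a max heap again. */
static void sift_down(int a[], int m, int i)
{
    while (2 * i + 1 < m) {
        int j = 2 * i + 1;                      /* left child */
        if (j + 1 < m && a[j] < a[j + 1])
            j++;                                /* pick the larger child */
        if (a[i] >= a[j])
            break;                              /* parental dominance holds */
        int t = a[i]; a[i] = a[j]; a[j] = t;
        i = j;
    }
}

void heap_sort(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)        /* bottom-up max-heap construction */
        sift_down(a, n, i);
    for (int m = n - 1; m > 0; m--) {           /* move the current maximum to the end */
        int t = a[0]; a[0] = a[m]; a[m] = t;
        sift_down(a, m, 0);
    }
}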
Unit - 2
GRAPH ALGORITHMS
GRAPH
Graph: It is a non-linear data structure which represents the set of vertices and edges.
G = (V,E).
Components of a Graph:
a. Vertices or Nodes - Fundamental units of a graph (store the information)
b. Edges or Arcs – It is used to connect two vertices in a graph (represent the
relationship between vertices)
Terminologies used in Graph:
a. Path – Sequence of vertices that are followed in order to reach a particular vertex
from the starting vertex
b. Closed path – A path in which the initial and terminating vertices are the same
c. Adjacency - Two nodes or vertices are adjacent if they are directly connected to
each other.
d. Cycle – A path with no repeated edges or vertices, except that the start and end
vertices are the same.
e. Complete Graph – Every node is connected to all other vertices in the graph (it
contains n(n-1)/2 edges for n vertices)
f. Connected Graph – A path exists between every pair of vertices (no isolated
nodes in the graph)
g. Degree of a node – The number of edges that are connected with the node (a node
with degree 0 is called an isolated vertex)
h. Cycle graph – A simple graph of 'n' nodes and n edges forming a cycle of length
'n' (all the vertices are of degree 2)
Types of Graph:
1. Directed or Undirected graph
2. Weighted and un-weighted graph
Representation of Graph (or) Memory organization for storing a graph: There are two
ways a graph can be stored or organized.
1. Adjacency Matrix Representation (or) Sequential Representation
2. Adjacency List Representation (or) Linked list representation
1. Adjacency Matrix Representation: 2D Array Representation – Square Matrix
- It is used to represent the graph with a matrix of order |V| * |V|.
- For an undirected, un-weighted graph, the entries are either 0 or 1.
- For a directed, un-weighted graph, the entry is 0 for a non-existent self loop, ∞ when no
edge exists and 1 when an edge exists.
- For a directed, weighted graph, the entry is 0 for a non-existent self loop, ∞ when no
edge exists and the assigned weight when an edge exists.
- Pros of Adjacency Graph:
1. Easy to insert and delete a node from the graph
2. It is more suitable for dense graph.
- Cons of Adjacency Matrix:
1. It consumes more space
2. It consumes more time to find inEdges and OutEdges of all the nodes.
- Example: a directed graph and its adjacency matrix (figure omitted)
- Pros of Adjacency List:
a. Wastage of memory reduced for sparse graph
b. It is used to find all the nodes adjacent to a given node easily.
- Cons of Adjacency List:
a. To find a particular edge, all the connected nodes may need to be explored, so it is a
slower method.
b. It is not suitable for dense graph.
- Example:
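A small C sketch contrasting the two representations for an illustrative 4-vertex directed graph (the graph data is made up for the example):

#include <stdio.h>
#define V 4

/* Adjacency matrix: adj[i][j] = 1 if there is an edge i -> j, otherwise 0. */
int adj[V][V] = {
    {0, 1, 1, 0},
    {0, 0, 1, 0},
    {0, 0, 0, 1},
    {1, 0, 0, 0}
};

/* Adjacency list (array form): for each vertex, its out-neighbours, terminated by -1. */
int list[V][V] = {
    {1, 2, -1},
    {2, -1},
    {3, -1},
    {0, -1}
};

int main(void)
{
    for (int u = 0; u < V; u++) {
        printf("out-neighbours of %d:", u);
        for (int k = 0; list[u][k] != -1; k++)
            printf(" %d", list[u][k]);
        printf("\n");
    }
    return 0;
}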
Applications of Graph
o Applications:
a. It helps to define the flow of computations of software programs
b. It is used in Google maps for implementing transportation applications
c. It is used in Social networks
d. It is used in operating system to detect deadlocks
e. It is used in World Wide Web searching operations
f. It is used for modelling data (data organization)
g. It is used to define the network of communications
h. It is used for designing electric circuits
GRAPH TRAVERSALS (Graph Search)
o Goal: Visit all the vertices in a given graph. (Traversals – visit the vertex). Traversal is
the process of visiting all the vertices in a graph.
o It is used to search for a vertex in a graph.
o It is used to decide the order of the vertices to be visited in the search process.
o No Loop: It is used to find the list of edges to be used in the search process without creating
loops.
o Types of Graph Traversal:
a. Breadth First Search (BFS)
b. Depth First Search (DFS)
BREADTH FIRST SEARCH (BFS)
Concept:
a. Initially the starting vertex is enqueued into the queue, marked as 1 in the visited
vertex array and considered as the current vertex.
b. Find the adjacent vertices of the current vertex, enqueue all the unvisited adjacent vertices
into the queue and mark them as 1 in the visited array.
c. Dequeue the next vertex from the queue and repeat step b until all the vertices are
visited.
BFS Algorithm:
o Algorithm BFS(G)
//Implements a breadth-first search traversal of a given graph
//Input: Graph G = (V,E)
//Output: Graph G with its vertices marked with consecutive integers in the order they
are visited by the BFS traversal
mark each vertex in V with 0 as a mark of being “unvisited”
count ← 0
for each vertex v in V do
if v is marked with 0
bfs(v)
Algorithm bfs(v)
//Problem Description: visits all the unvisited vertices connected to vertex v by a path
and numbers them in the order they are visited via a global
variable count
count ← count + 1;
mark v with count and initialize a queue with v
while the queue is not empty do
for each vertex w in V adjacent to the front vertex do
if w is marked with 0
count ← count + 1; mark w with count
add w to the queue
remove the front vertex from the queue
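A hedged C sketch of the BFS traversal above, using an explicit array-based queue over an adjacency matrix (names and the MAXV limit are illustrative):

#include <stdio.h>
#define MAXV 100

void bfs(int adj[MAXV][MAXV], int n, int start)
{
    int visited[MAXV] = {0};
    int queue[MAXV], front = 0, rear = 0;

    visited[start] = 1;
    queue[rear++] = start;                /* enqueue the starting vertex */
    while (front < rear) {
        int v = queue[front++];           /* dequeue the current vertex */
        printf("%d ", v);
        for (int w = 0; w < n; w++) {     /* enqueue all unvisited adjacent vertices */
            if (adj[v][w] && !visited[w]) {
                visited[w] = 1;
                queue[rear++] = w;
            }
        }
    }
}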
Running time of BFS:
a. Adjacency Matrix Representation : O(|V|²) // |V| - number of vertices
b. Adjacency List Representation : O(|V|+|E|) // |E| - number of edges
Example: Find the graph traversal for the following graph using BFS.
Example 2:
Applications of BFS:
a. Shortest path and minimum spanning tree for unweighted graph
b. Minimum spanning tree for weighted graphs
c. Finding neighbour nodes in peer to peer networks
d. Crawlers in web search engine (building index for search results)
e. Finding friends in Social networks
f. GPS navigation Systems
g. Broadcasting in network communication
h. Cycle detection in undirected graph
i. Find the path between two vertices
j. Check whether the graph is bipartite or not
DEPTH FIRST SEARCH (DFS)
Goal: visiting the vertices one by one.
Input: Unweighted graph
Output: Depth First Forest (Tree edges and Back Edges)
Algorithm Design Technique : Backtracking
Data structure Used: stack (last in first out)
Additional data structures used: Visited vertex array – maintain the details about
visited and unvisited vertices (initially all the vertices are marked as 0).
Depth-First Search Forest:
a. It is a tree that indicates the order in which the vertices are visited.
b. Root node of the depth-first forest – the starting vertex becomes the root node.
Tree Edge: An edge leading to a newly visited (previously unvisited) vertex is called a tree
edge.
Back Edge: An edge leading to a previously visited vertex other than the
immediate predecessor (parent in the tree) is called a back edge.
Processing Order: If a vertex has more than one adjacent vertex, then the
vertices are visited in alphabetical order.
Concept:
a. Initially the starting vertex is pushed onto the stack and marked as 1 in the visited vertex
array.
b. Find an adjacent vertex of the starting vertex, push it onto the stack, add the
vertex into the depth-first forest and mark it as 1.
c. If the vertex is already visited, then find the next adjacent vertex and repeat
step b.
d. If all the vertices are not yet visited, then pop a vertex from the stack and find the
adjacent vertices of the vertex on the top of the stack.
e. Repeat steps b to d until all the vertices are visited.
DFS Algorithm:
o Algorithm DFS(G)
// Problem description: Find DFS for the given graph
//Input: Graph G
//Output: Visited order of the vertices along with back edges
Initialize 0 for each vertex (not visited)
count=0 //global variable
for( each vertex in V)
{
if(v is marked as 0)
dfs(v)
}
Algorithm dfs(v)
// Problem description: Recursively identify the vertices adjacent to current vertex
count=count+1;
mark v with count
for(each adjacent vertex w of v)
{
if(w is marked as 0)
dfs(w)
}
Running time of DFS:
a. Adjacency Matrix Representation : O(|V|²) // |V| - number of vertices
b. Adjacency List Representation : O(|V|+|E|) // |E| - number of edges
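A hedged recursive C sketch of the DFS routine above over an adjacency matrix (names are illustrative; visiting neighbours in increasing index order plays the role of alphabetical order):

#define MAXV 100

static int visited[MAXV];      /* 0 = unvisited, otherwise the visit number */
static int count;

static void dfs_visit(int adj[MAXV][MAXV], int n, int v)
{
    visited[v] = ++count;                       /* mark v with the visit number */
    for (int w = 0; w < n; w++)
        if (adj[v][w] && visited[w] == 0)       /* unvisited adjacent vertex */
            dfs_visit(adj, n, w);
}

void dfs(int adj[MAXV][MAXV], int n)
{
    count = 0;
    for (int v = 0; v < n; v++)
        visited[v] = 0;
    for (int v = 0; v < n; v++)                 /* also covers disconnected components */
        if (visited[v] == 0)
            dfs_visit(adj, n, v);
}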
Example: Find the graph traversal for the following graph using DFS.
Applications of DFS:
a. DFS algorithm can be used to implement the topological sorting.
b. It can be used to find the paths between two vertices.
c. It can also be used to detect cycles in the graph.
d. DFS algorithm is also used for one solution puzzles.
e. DFS is used to determine if a graph is bipartite or not.
Difference between BFS and DFS
S.No.   Characteristics                     Breadth First Search (BFS)   Depth First Search (DFS)
1       Data structure used                 Queue                        Stack
2       Number of vertex orderings          One ordering                 Two orderings
3       Edge types (undirected graphs)      Tree and cross edges         Tree and back edges
4       Efficiency for adjacency matrix     O(|V|²)                      O(|V|²)
5       Efficiency for adjacency list       O(|V|+|E|)                   O(|V|+|E|)
CONNECTIVITY IN GRAPH
Connected Graph: A graph is said to be connected if there is a path between every
pair of vertex (or) a graph is a connected graph if, for each pair of vertices, there exists
at least one single path which joins them.
It is a basic concept in graph theory.
Cut Vertices: A vertex V ∈ G is called a cut vertex of 'G' if 'G−V' (deleting 'V' from
'G') results in a disconnected graph.
o Removing a cut vertex from a graph breaks it into two or more graphs.
(Removing a cut vertex may render a graph disconnected).
o A connected graph 'G' may have at most (n–2) cut vertices.
Cut Edge or Bridge: Let 'G' be a connected graph. An edge 'e' ∈ G is called a cut
edge if 'G−e' results in a disconnected graph.
Edge Connectivity: The edge connectivity of a connected graph G is the minimum
number of edges whose removal makes G disconnected. It is denoted by λ(G).
Vertex Connectivity: The connectivity (or vertex connectivity) of a connected graph G
is the minimum number of vertices whose removal makes G disconnected or reduces it to a
trivial graph. It is denoted by K(G).
Different Types of Connected Graph:
a. Fully Connected Graph (or) Complete Graph – every pair of vertices in the graph is
connected by a unique edge
- The number of edges = (n*(n-1))/2
b. K-Connected Graph – the graph remains connected whenever fewer than k vertices
are removed (the smallest disconnecting set contains k vertices).
c. Strongly Connected Graph - Every vertex can be reached from every other
vertex in the graph
- In a directed graph, there exists a path in each possible direction between
each pair of vertices in the graph.
Verification of Connectivity in Graphs:
1. Breadth First Search (BFS)
2. Depth First Search (DFS)
Applications of Connectivity Concept:
a. Network Applications
b. Routing in Transportation Networks
c. Network tolerance
STRONG CONNECTIVITY
Definition: Connected with directed graph
- Every vertex can be reached from every other vertex in a graph
- In directed graph, if there exists a path in each possible direction between each pair
of vertices in the graph.
Maximum number of edges in a strongly connected graph = n*(n-1)
If a strongly connected graph contains more than one vertex, then the graph contains a cycle.
Example:
BI - CONNECTIVITY
Definition: A graph is said to be bi-connected if: (more than two vertices)
a. It is a connected graph (simple path – path from one vertex to all other vertices in
the graph)
b. Even after removing any vertex, the graph still remains connected. (no
articulation point)
A connected graph is bi-connected if it is connected and doesn’t have any Articulation
Point or cut vertex.
Definition of Articulation Point: A vertex whose removal divides the graph into two or
more sub-graphs.
Methods for checking for Bi-Connectivity:
i. Use the brute force method – remove every vertex in turn and check whether the
remaining graph is connected or not. O(V*(V+E))
Applications:
i. Robustness in a network
ii. Routing in Transportation Networks
iii. Design of power grid networks
iv. Identify the influence node in social network analysis
Definition of Spanning Tree: It is a connected acyclic sub-graph that contains all the
vertices of the graph.
Minimum Spanning Tree Definition: It is a spanning tree of the smallest weight.
[Condition: Weighted connected graph].
Two methods:
a. Prim's algorithm
b. Kruskal's algorithm
PRIMS ALGORITHM
Aim (or) Objective (or) Goal: Finding minimum spanning tree for the given connected
graph.
Input: Weighted Connected Graph (may or may not be a directed graph)
Output: Minimum spanning tree (edges with vertices) and minimum cost.
Algorithm Design Technique: Greedy Technique
Condition: Loop will not be created when vertices are connected by the edges. (cycle
should not be formed)
Basic Operation: Find minimum edge from the list of adjacent edges
Concept:
1. Choose any vertex from the set of vertices.
2. Expand the current tree by identifying the nearest vertex, i.e., the minimum-weight edge
from the current tree to a vertex outside it. (Expand only one vertex at a time on each iteration.)
3. Above step2 is repeated for n-1 times. (Where n is the number of vertices in the
graph)
Algorithm or Pseudo code for prim’s algorithm
Algorithm prim(G)
//Problem Description: Find minimum spanning tree for the given graph
//Input: A weighted connected graph G=(V,E)
//Output: ET (set of edges in the tree with minimum cost)
VT = {V0} // start with the vertex V0
ET = NULL
for(i=1; i<n; i++)
{
Find a minimum weighted edge e*=(v*,u*) among all the edges (v,u)
Such that v is in VT and u is in V-VT
VT = VT ∪ {u*}
ET = ET ∪ {e*}
}
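A hedged O(|V|²) C sketch of Prim's algorithm on a weighted adjacency matrix (names are illustrative; INF marks a missing edge):

#include <limits.h>
#define MAXV 100
#define INF  INT_MAX

/* Returns the total cost of the minimum spanning tree; parent[v] records the tree edges. */
int prim(int cost[MAXV][MAXV], int n, int parent[])
{
    int in_tree[MAXV] = {0}, dist[MAXV], total = 0;

    for (int v = 0; v < n; v++) { dist[v] = INF; parent[v] = -1; }
    dist[0] = 0;                                   /* start from vertex 0 */

    for (int i = 0; i < n; i++) {
        int u = -1;
        for (int v = 0; v < n; v++)                /* pick the nearest fringe vertex */
            if (!in_tree[v] && (u == -1 || dist[v] < dist[u]))
                u = v;
        in_tree[u] = 1;
        total += dist[u];
        for (int v = 0; v < n; v++)                /* update the fringe labels */
            if (!in_tree[v] && cost[u][v] < dist[v]) {
                dist[v] = cost[u][v];
                parent[v] = u;                     /* tree edge (parent[v], v) */
            }
    }
    return total;
}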
Example: Construct a Minimum Spanning Tree for the following graph using Prim's
Algorithm
Solution: (illustrations of the partial trees are omitted; each remaining vertex is shown
with its nearest tree vertex and the corresponding edge weight)
Tree Vertices    Remaining (fringe) Vertices        Description
a(-,-)           b(a,3), e(a,6), f(a,5)             Initial
b(a,3)           c(b,1), e(a,6), f(b,4)             Selected edge b(a,3)
c(b,1)           d(c,6), e(a,6), f(b,4)             Selected edge c(b,1)
f(b,4)           d(f,5), e(f,2)                     Selected edge f(b,4)
e(f,2)           d(f,5)                             Selected edge e(f,2)
d(f,5)           -                                  Selected edge d(f,5); all vertices are now in the tree
The unused edges d(c,6), d(e,8) and e(a,6) would create closed paths and are discarded.
Minimum spanning tree edges: ab(3), bc(1), bf(4), fe(2), fd(5); total cost = 15.
Time Complexity (or) Running Time:
- Time Complexity: O(|E| log |V|), using a min-heap and adjacency lists (for a connected graph, |V| − 1 ≤ |E|)
Applications of Prim’s Algorithm:
a. Cable laying, electric grids layout, and LAN networks (the graphs are dense)
b. Network for roads and Rail tracks connecting all the cities.
c. Irrigation channels and placing microwave towers
d. Designing a fibre-optic grid or ICs.
e. Travelling Salesman Problem.
f. Cluster analysis.
g. Path-finding algorithms used in AI (Artificial Intelligence) and game development.
KRUSKAL ALGORITHM
Aim (or) Objective (or) Goal: Finding minimum spanning tree for the given
connected graph.
Input: Weighted Connected Graph (may or may not be a directed graph). (The edges are
arranged in ascending order of their weights.)
Output: Minimum spanning tree (edges with vertices) and minimum cost.
Algorithm Design Technique: Greedy Technique
Condition: A loop must not be created when vertices are connected by the edges. (A cycle
should not be formed.)
Basic Operation: Find minimum edge from the list of sorted edges
Concept:
a. Sort all the edges of the graph in non-decreasing order of their weights.
b. Select the next minimum-cost edge and add it to the sub-tree (forest) if no cycle is
formed.
c. Step b is repeated until n-1 edges have been selected. (Where n is the number of vertices in the
graph)
Algorithm or Pseudocode for Kruskal algorithm
Algorithm kruskal(G)
//Problem Description: Find minimum spanning tree for the given graph
//Input: A weighted connected graph G=(V,E)
//Output: ET (set of edges in the tree with minimum cost)
sort the edges of E in non-decreasing order of their weights
ET = NULL
ecounter = 0        // number of edges currently in the tree
k = 0               // index into the sorted list of edges
while(ecounter < n-1)
{
k = k+1
if (ET ∪ {eik} is acyclic )
{
ET = ET ∪ {eik}
ecounter = ecounter + 1
}
}
return ET
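A hedged C sketch of Kruskal's algorithm that uses a simple union–find structure for the acyclicity test (the Edge type and names are illustrative; it assumes at most 100 vertices):

#include <stdlib.h>

typedef struct { int u, v, w; } Edge;

static int find_root(int parent[], int x)
{
    while (parent[x] != x)              /* follow parent links to the set root */
        x = parent[x];
    return x;
}

static int cmp_edge(const void *a, const void *b)
{
    return ((const Edge *)a)->w - ((const Edge *)b)->w;
}

/* Returns the MST cost for n vertices and e edges; edges[] is sorted in place. */
int kruskal(Edge edges[], int e, int n)
{
    int parent[100], total = 0, taken = 0;
    for (int i = 0; i < n; i++) parent[i] = i;
    qsort(edges, e, sizeof(Edge), cmp_edge);   /* sort by non-decreasing weight */

    for (int i = 0; i < e && taken < n - 1; i++) {
        int ru = find_root(parent, edges[i].u);
        int rv = find_root(parent, edges[i].v);
        if (ru != rv) {                        /* adding this edge keeps the forest acyclic */
            parent[ru] = rv;                   /* union the two components */
            total += edges[i].w;
            taken++;
        }
    }
    return total;
}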
Example: Construct a Minimum Spanning Tree for the following graph using Kruskal's
Algorithm
Sorted list of edges (non-decreasing weight): fe/2, ab/3, bf/4, cf/4, af/5, fd/5, cd/6, ae/6, ed/8
First iteration: the minimum-weight edge fe/2 is selected.
a b c d
0 10 3 4
- a-c-b a-c a-c-d
Edge (b,a): d[a] small. No relaxing is required.
Edge (d,a): d[a] small. No relaxing is required.
o Iteration 3: Considering each edge in the graph, perform edge relaxing
(a,c), (c,d), (c,b), (b,a), (d,a)
Edge (a,c): d[c] > d[a] + cost(a,c)
3 > 0 + 3 = 3 . No relaxing required.
a b c d
0 ∞ 3 ∞
- - a-c -
Edge (c,d): d[d] > d[c] +cost(c,d)
4 > 3 + 1 = 4. No relaxing required.
a b c d
0 ∞ 3 4
- - a-c a-c-d
Edge (c,b): d[b] > d[c] + cost(c,b)
10 > 3 + 7 = 10. No Relaxing required.
a b c d
0 10 3 4
- a-c-b a-c a-c-d
Edge (b,a): d[a] small. No relaxing is required.
Edge (d,a): d[a] small. No relaxing is required.
Step 3: checking for negative edge cycle if any. So perform edge relaxing once again.
o Considering each edge in the graph, perform edge relaxing
(a,c), (c,d), (c,b), (b,a), (d,a)
Edge (a,c): d[c] > d[a] + cost(a,c)
3 > 0 + 3 = 3 . No relaxing required.
a b c d
0 ∞ 3 ∞
- - a-c -
Edge (c,d): d[d] > d[c] +cost(c,d)
4 > 3 + 1 = 4. No relaxing required.
a b c d
0 ∞ 3 4
- - a-c a-c-d
Edge (c,b): d[b] > d[c] + cost(c,b)
10 > 3 + 7 = 10. No Relaxing required.
a b c d
0 10 3 4
- a-c-b a-c a-c-d
Edge (b,a): d[a] small. No relaxing is required.
Edge (d,a): d[a] small. No relaxing is required.
No changes in the path costs. Therefore, there is no negative-weight edge cycle
in the given graph.
Solution:
Vertex   Cost   Path
a        0      -
b        10     a-c-b
c        3      a-c
d        4      a-c-d
Example 2: Find the single-source shortest paths from the vertex 'a' using the Bellman–Ford
algorithm.
o Step 2: Iteration 1: Considering each edge in the graph, perform edge relaxing.
(A,B), (A,C), (A,D), (D,C), (C,B), (B,E), (C,E), (D,F), (E,F)
A B C D E F
0 1 3 5 0 3
- A-D-C-B A-D-C A-D A-D-C-B-E A-D-C-B-E-F
Step 3: checking for negative edge cycle if any. So perform edge relaxing once again.
(A,B), (A,C), (A,D), (D,C), (C,B), (B,E), (C,E), (D,F), (E,F)
A B C D E F
0 1 3 5 0 3
- A-D-C-B A-D-C A-D A-D-C-B-E A-D-C-B-E-F
No changes in the path costs. Therefore, there is no negative-weight edge cycle
in the given graph.
Solution:
Vertex   Cost   Path
A        0      -
B        1      A-D-C-B
C        3      A-D-C
D        5      A-D
E        0      A-D-C-B-E
F        3      A-D-C-B-E-F
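The relaxation steps traced above follow the standard Bellman–Ford procedure; a hedged C sketch over an edge list (the Edge type and names are illustrative, not taken from these notes):

#include <limits.h>
#define INF INT_MAX

typedef struct { int u, v, w; } Edge;

/* Single-source shortest paths from src; returns 0 if a negative-weight cycle is detected. */
int bellman_ford(const Edge edges[], int e, int n, int src, int dist[])
{
    for (int v = 0; v < n; v++) dist[v] = INF;
    dist[src] = 0;

    for (int i = 0; i < n - 1; i++)                     /* n-1 rounds of edge relaxing */
        for (int j = 0; j < e; j++) {
            int u = edges[j].u, v = edges[j].v, w = edges[j].w;
            if (dist[u] != INF && dist[u] + w < dist[v])
                dist[v] = dist[u] + w;                  /* relax edge (u, v) */
        }

    for (int j = 0; j < e; j++) {                       /* one extra pass detects negative cycles */
        int u = edges[j].u, v = edges[j].v, w = edges[j].w;
        if (dist[u] != INF && dist[u] + w < dist[v])
            return 0;                                   /* a negative-weight cycle exists */
    }
    return 1;
}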
o Step 4: Apply Relaxing concept for the entire unvisited vertices; find the
distance for all the vertices from d.
Vertex b: d[b] = min (d[b] , distance[d] + cost(d,b) )
= min(10, 4+∞)
d[b] = 10 path : a-c-b //old path retained
From the list of distances of the unvisited vertices, select the vertex that has the
visited = {a, c, d, b} unvisited = {}
current = {b}, path : a-c-b
All the vertices are processed. The algorithm is terminated.
Solution:
Vertex   Cost   Path
a        0      -
b        10     a-c-b
c        3      a-c
d        4      a-c-d
Time Complexity (or) Running Time: The running time of Dijkstra's algorithm
depends on the data structure used for representing the input.
o Adjacency matrix: T(n) ≈ O(|V|²)
o Adjacency list (with a min-heap): T(n) ≈ O(|E| log |V|)
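A hedged O(|V|²) C sketch of Dijkstra's algorithm on a weighted adjacency matrix (names are illustrative; it assumes non-negative edge weights and uses INF for missing edges):

#include <limits.h>
#define MAXV 100
#define INF  INT_MAX

void dijkstra(int cost[MAXV][MAXV], int n, int src, int dist[])
{
    int visited[MAXV] = {0};
    for (int v = 0; v < n; v++) dist[v] = INF;
    dist[src] = 0;

    for (int i = 0; i < n; i++) {
        int u = -1;
        for (int v = 0; v < n; v++)              /* pick the closest unvisited vertex */
            if (!visited[v] && (u == -1 || dist[v] < dist[u]))
                u = v;
        if (u == -1 || dist[u] == INF) break;    /* remaining vertices are unreachable */
        visited[u] = 1;
        for (int v = 0; v < n; v++)              /* relax the edges leaving u */
            if (!visited[v] && cost[u][v] != INF && dist[u] + cost[u][v] < dist[v])
                dist[v] = dist[u] + cost[u][v];
    }
}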
Weighted (or) Distance Matrix: The lengths of the shortest paths are recorded in an n-by-n
matrix called the distance matrix. (n – Number of vertices)
Adjacency Matrix: The graph is represented by using an adjacency (weight) matrix.
Pseudocode for Floyd's algorithm:
//Problem Description: Find the shortest path costs between every pair of vertices.
//Input: The weight matrix D(0)
//Output: The distance matrix of shortest path lengths.
for(k=0; k<n;k++)
{
for(i=0; i<n;i++)
{
for(j=0;j<n;j++)
{
D(k)[i,j] = min{ D(k-1)[i,j] , D(k-1)[i,k] + D(k-1)[k,j] }
}
}
}
Note: Here the index k means that only the first k vertices are allowed to be used as
intermediate vertices when finding the shortest paths.
Time efficiency of Floyd's algorithm: O(n³)
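A hedged in-place C sketch of the triple loop above (d[][] initially holds the weight matrix; the INF value here is an assumption chosen so that INF + INF does not overflow a long):

#define MAXV 100
#define INF  1000000000L     /* "no edge"; large, but INF + INF still fits in a long */

/* d[][] starts as the weight matrix (0 on the diagonal, INF where no edge exists)
   and ends as the matrix of shortest path lengths. */
void floyd(long d[MAXV][MAXV], int n)
{
    for (int k = 0; k < n; k++)          /* allow vertex k as an intermediate vertex */
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (d[i][k] + d[k][j] < d[i][j])
                    d[i][j] = d[i][k] + d[k][j];
}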
Example: Obtain all pair shortest paths for the following graph.
Step 2: Vertex 'a' is considered as the intermediate vertex
Step 5: Vertex 'd' is considered as the intermediate vertex
The final distance matrix is, (shortest path between all the vertices are given below)
TRANSITIVE CLOSURE (WARSHALL'S ALGORITHM)
Step 1: The initial transitive closure of the given graph is (adjacency matrix representation)
a b c d
a 0 1 0 0
R0 = b 0 0 0 1
c 0 0 0 0
d 1 0 1 0
Step 2: The transitive closure of the graph by considering the intermediate vertex 'a'
          a b c d
      a   0 1 0 0
R1 =  b   0 0 0 1
      c   0 0 0 0
      d   1 1 1 0
Step 3: The transitive closure of the graph by considering the intermediate vertex 'b'
          a b c d
      a   0 1 0 1
R2 =  b   0 0 0 1
      c   0 0 0 0
      d   1 1 1 1
Step 4: The transitive closure of the graph by considering the intermediate vertex 'c'
          a b c d
      a   0 1 0 1
R3 =  b   0 0 0 1
      c   0 0 0 0
      d   1 1 1 1
Step 5: The transitive closure of the graph by considering the intermediate vertex 'd'
          a b c d
      a   1 1 1 1
R4 =  b   1 1 1 1
      c   0 0 0 0
      d   1 1 1 1
Therefore the final transitive closure of the graph is
          a b c d
      a   1 1 1 1
R4 =  b   1 1 1 1
      c   0 0 0 0
      d   1 1 1 1
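The step-by-step matrices R0 … R4 above are produced by Warshall's algorithm; a hedged C sketch (r[][] starts as the adjacency matrix; names are illustrative):

#define MAXV 100

/* r[][] starts as the adjacency matrix and ends as the transitive closure. */
void warshall(int r[MAXV][MAXV], int n)
{
    for (int k = 0; k < n; k++)          /* allow vertex k as an intermediate vertex */
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (r[i][k] && r[k][j])
                    r[i][j] = 1;         /* a path i -> k -> j means i can reach j */
}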
Value of the flow (v): (total outflow from the source or total inflow into the sink)
Initial step: Start with zero flow.
Flow augmenting: In each iteration, try to find the path from source to sink along
which some additional flow can be sent. (if no path found, the current flow is optimal)
It is also called the Ford–Fulkerson method or the augmenting-path method.
Example:
Search for a flow-augmenting path from source to sink along edges whose current flow xij is
less than the capacity uij.
Cut: A partition of the network's vertices into two subsets, written cut(X, X'), where X
contains the source and X' contains the sink vertex.
Example: a) Cut(X,X') = {(1,2), (1,4)} if X={1} and X'={2,3,4,5,6}
b) Cut(X, X') = {(3,6), (5,6)} if X={1,2,3,4,5} and X'={6}
Properties of a cut: If all the edges of a cut are deleted from the network, then there is no
directed path from the source to the sink.
Theorem (max-flow min-cut): The value of a maximum flow in a network is equal to the
capacity of its minimum cut.
Let x be a feasible flow of value v and let C(X, X') be a cut of capacity c in the
same network.
Flow across this cut = sum of the flows on the edges from X to X' minus the
sum of the flows on the edges from X' to X.
The flow across the cut C(X, X') is equal to v.
Since the flow xij on each edge cannot exceed the capacity uij, ............ (1)
the value of any feasible flow in a network cannot exceed the
capacity of any cut in that network.
Finding the cut: let v* be the value of the final flow x* obtained by the augmenting-path
method.
- If we can find a cut whose capacity is equal to v*, then from (1) we conclude that
a) v* (the value of the final flow) is maximum among all feasible flows,
b) the capacity of that cut is minimal among all cuts in the network, and
c) the maximum flow value = the minimum-cut capacity.
Complexity:
- Number of augmenting paths ≤ |V|*|E| / 2
- Time required to find a shortest augmenting path by BFS is O(|E|+|V|) = O(|E|) – with the
adjacency list representation
- Time efficiency of the shortest-augmenting-path algorithm = O(|V|*|E|²)
MAXIMUM MATCHING IN BIPARTITE GRAPHS
Goal: Pairing of elements in two sets (finding the maximum matching in a graph)
Matching in a graph: It is a subset of its edges with the property that no two edges
share a vertex.
Maximum matching (or) Maximum Cardinality Matching: It is a matching with the
largest number of edges.
Bipartite Graph: All the vertices can be partitioned into two disjoint sets (U and V) of
the same or different sizes so that every edge connects a vertex in one of these sets to a vertex
in the other set. (Equivalently, the vertices can be coloured with two colours so that the two
end vertices of every edge receive different colours.)
A graph is bipartite if and only if it can be coloured with two colours such that no two
adjacent vertices share the same colour; equivalently, every cycle in the graph has even length.
Odd Length: A graph is bipartite if and only if it does not contain a cycle of odd
length.
Algorithm: Graph coloring problem (two colouring problem)
Time Complexity :
o Adjacency Matrix T(n) ≈ O(V*V)
o Adjacency List T(n) ≈ O(V+E)
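A hedged C sketch of the two-colouring check described above, using BFS over an adjacency matrix (names are illustrative; it returns 1 if the graph is bipartite):

#define MAXV 100

int is_bipartite(int adj[MAXV][MAXV], int n)
{
    int color[MAXV], queue[MAXV];
    for (int v = 0; v < n; v++) color[v] = -1;          /* -1 means uncoloured */

    for (int s = 0; s < n; s++) {                       /* handles disconnected graphs */
        if (color[s] != -1) continue;
        int front = 0, rear = 0;
        color[s] = 0;
        queue[rear++] = s;
        while (front < rear) {
            int u = queue[front++];
            for (int v = 0; v < n; v++) {
                if (!adj[u][v]) continue;
                if (color[v] == -1) {                   /* give v the opposite colour */
                    color[v] = 1 - color[u];
                    queue[rear++] = v;
                } else if (color[v] == color[u]) {
                    return 0;                           /* odd cycle found: not bipartite */
                }
            }
        }
    }
    return 1;
}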
Applications of Bipartite Graphs:
a. Bipartite graph can be used in the medical field in the detection of lung cancer,
throat cancer etc.
b. Used in search advertising and e-commerce for similarity ranking.
c. Predict movie preferences of a person.
d. Stable marriage and other matching problems.