DAA 4th Unit Notes
Introduction
• A greedy algorithm is an algorithm that makes the locally best choice at
each step, with the hope that this will give an optimal solution
for the problem as a whole.
• Greedy algorithms are simple and straightforward, easy to invent, easy
to implement and, most of the time, quite efficient.
• Algorithms that use the greedy approach include:
Prim’s Algorithm
Prim's algorithm is a greedy algorithm that is used to find a minimum spanning tree
of a connected, weighted graph. Minimum spanning trees are helpful in routing
applications and communication networks.
Prim's algorithm finds a subset of edges that connects every vertex of the
graph such that the sum of the weights of the edges is minimized.
Prim's algorithm starts with a single node and, at every step, examines all the edges
that connect the tree built so far to adjacent nodes. The edge with the minimal weight
that causes no cycle in the graph is selected.
o First, initialize an MST (Minimum Spanning Tree) with a randomly chosen
vertex.
o Next, find all the edges that connect the tree built so far with new
vertices. From the edges found, select the minimum-weight edge and add it to the tree.
o Repeat step 2 until the minimum spanning tree is formed.
Algorithm
Step 1: Select a starting vertex
Step 2: Repeat Steps 3 and 4 while there are fringe vertices
Step 3: Select an edge 'e' of minimum weight connecting a tree vertex and a fringe vertex
Step 4: Add the selected edge and the fringe vertex to the minimum spanning tree T
[END OF LOOP]
Step 5: EXIT
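The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name prim_mst, the adjacency-list format and the sample graph are assumptions made for this example, and the graph is assumed to be connected and undirected. A binary heap holds the fringe edges so that the minimum-weight edge can be selected quickly.

```python
import heapq

def prim_mst(graph, start):
    """Return (total_weight, edges) of a minimum spanning tree.

    graph: dict mapping each vertex to a list of (neighbor, weight) pairs.
    Assumes the graph is connected and undirected.
    """
    visited = {start}
    # Fringe edges are stored as (weight, tree_vertex, fringe_vertex).
    fringe = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(fringe)
    mst_edges = []
    total = 0
    while fringe and len(visited) < len(graph):
        w, u, v = heapq.heappop(fringe)  # minimum-weight fringe edge
        if v in visited:
            continue  # both ends already in the tree: would create a cycle
        visited.add(v)
        mst_edges.append((u, v, w))
        total += w
        for nxt, nw in graph[v]:
            if nxt not in visited:
                heapq.heappush(fringe, (nw, v, nxt))
    return total, mst_edges

# Hypothetical sample graph used only for illustration.
example_graph = {
    "A": [("B", 2), ("C", 3)],
    "B": [("A", 2), ("C", 1), ("D", 4)],
    "C": [("A", 3), ("B", 1), ("D", 5)],
    "D": [("B", 4), ("C", 5)],
}
total, edges = prim_mst(example_graph, "A")
print(total, edges)
```

For the sample graph the tree consists of the edges A-B, B-C and B-D, with a total weight of 7.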
Kruskal's Algorithm
Kruskal's algorithm is used to find a minimum-cost spanning tree. A spanning tree is a
connected subgraph that uses all the vertices and contains no loops (cycles). In other words,
there is a path from any vertex to any other vertex, but no loops.
The minimum spanning tree is a spanning tree that has the smallest total edge weight.
Kruskal's algorithm takes a weighted graph as input and selects edges from the
graph that form a tree that includes every vertex of the graph.
Kruskal's algorithm starts from the edge that has the lowest weight and
keeps adding edges in increasing order of weight until a spanning tree is formed.
The following are the steps used to implement the Kruskal algorithm:
o First, sort the edges in ascending order of their weights.
o Take the edge with the lowest weight and add it to the spanning tree. If
adding an edge to the spanning tree would create a cycle, reject that edge.
o Keep adding edges until the tree contains all the vertices (V - 1 edges for V vertices).
Algorithm
Step 1: Create a forest F in such a way that every vertex of the graph is a separate tree.
Step 2: Create a set E that contains all the edges of the graph.
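Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning
Step 4: Remove an edge of minimum weight from E.
Step 5: If the edge connects two different trees of F, add it to F, merging the two
trees into one; otherwise discard the edge.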
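Kruskal's approach described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name kruskal_mst, the edge format (weight, u, v), the vertex numbering 0..n-1 and the sample edges are all made up for this example. Cycle detection uses a simple union-find (disjoint-set) structure, a common choice that these notes do not prescribe.

```python
def kruskal_mst(num_vertices, edges):
    """Return (total_weight, mst_edges) for an undirected weighted graph.

    edges: list of (weight, u, v), with vertices numbered 0..num_vertices-1.
    """
    # Every vertex starts as its own tree in the forest.
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    total = 0
    for w, u, v in sorted(edges):  # ascending order of weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # different trees: adding the edge makes no cycle
            parent[root_u] = root_v
            mst.append((u, v, w))
            total += w
            if len(mst) == num_vertices - 1:
                break  # the forest has become a spanning tree
    return total, mst

# Hypothetical sample edges used only for illustration.
example_edges = [(2, 0, 1), (3, 0, 2), (1, 1, 2), (4, 1, 3), (5, 2, 3)]
print(kruskal_mst(4, example_edges))
```

For the sample edges the edge (0, 2) of weight 3 is rejected because it would close a cycle, and the resulting tree has a total weight of 7.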
Dijkstra's Algorithm
1. Dijkstra's Algorithm begins at the node we select (the source node), and it examines the graph to
find the shortest path between that node and all the other nodes in the graph.
2. The Algorithm keeps a record of the currently known shortest distance from the source node to
each node, and it updates these values whenever it finds a shorter path.
3. Once the Algorithm has found the shortest path between the source and another node, that
node is marked as 'visited' and its distance is final.
4. The procedure continues until all the nodes in the graph have been visited. In this
manner, we obtain, for every other node, the shortest possible path from the source node
to that node.
Dijkstra's Algorithm:
Step 1: First, we will mark the source node with a current distance of 0 and set the rest of the
nodes to INFINITY.
Step 2: We will then set the unvisited node with the smallest current distance as the current
node, suppose X.
Step 3: For each neighbor N of the current node X: We will then add the current distance of X
with the weight of the edge joining X-N. If it is smaller than the current distance of N, set it as
the new current distance of N.
Step 4: We will mark the current node X as visited.
Step 5: We will repeat the process from 'Step 2' if there is any node unvisited left in the graph.
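The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name dijkstra, the adjacency-list format and the sample graph are assumptions made for this example, and all edge weights are assumed to be non-negative. A binary heap is used to pick the unvisited node with the smallest current distance.

```python
import heapq
import math

def dijkstra(graph, source):
    """Return a dict of shortest distances from source to every node.

    graph: dict mapping each node to a list of (neighbor, weight) pairs.
    Every node must appear as a key; weights are assumed non-negative.
    """
    dist = {node: math.inf for node in graph}  # Step 1
    dist[source] = 0
    visited = set()
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)  # Step 2: smallest current distance
        if x in visited:
            continue  # stale heap entry for an already settled node
        visited.add(x)  # distance of x is now final
        for n, w in graph[x]:  # Step 3: relax each edge X-N
            new_dist = d + w
            if new_dist < dist[n]:
                dist[n] = new_dist
                heapq.heappush(heap, (new_dist, n))
    return dist

# Hypothetical sample graph used only for illustration.
example_graph = {
    "S": [("A", 4), ("B", 1)],
    "A": [("C", 1)],
    "B": [("A", 2), ("C", 5)],
    "C": [],
}
print(dijkstra(example_graph, "S"))
```

In the sample graph the direct edge S-A costs 4, but the path S-B-A costs 3, so the distance of A is updated when B is processed.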
Lower Bounds
Lower bound: an estimate on a minimum amount of work needed to solve a given problem
Examples:
Number of comparisons needed to find the largest element in a set of n numbers
Number of comparisons needed to sort an array of size n
Number of comparisons necessary for searching in a sorted array
Number of multiplications needed to multiply two n-by-n matrices
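As a small illustration of the first example, the sketch below counts the comparisons made by a straightforward scan for the largest element. The function name find_max_counting is an assumption made for this example. The scan uses n - 1 comparisons, which matches the known lower bound of n - 1 comparisons for finding the largest of n elements with a comparison-based algorithm.

```python
def find_max_counting(values):
    """Return (largest value, number of element comparisons used).

    Assumes values is a non-empty list.
    """
    comparisons = 0
    largest = values[0]
    for v in values[1:]:
        comparisons += 1  # one comparison per remaining element
        if v > largest:
            largest = v
    return largest, comparisons

print(find_max_counting([7, 3, 9, 1, 5]))  # 5 elements, 4 comparisons
```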
Examples
Finding max element
Polynomial evaluation
Sorting
Element uniqueness
Hamiltonian circuit existence
Conclusions
May or may not be useful
Adversary Arguments
Adversary argument: a method of proving a lower bound by playing the role
of an adversary that makes the algorithm work the hardest by adjusting the input.
The adversary cannot lie, however: its answers must stay consistent with some valid input.
Root Node: The root node is where the decision tree starts. It represents the
entire dataset, which is further divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes; the tree cannot be
split further once a leaf node is reached.
Splitting: Splitting is the process of dividing the decision node/root node into
sub-nodes according to the given conditions.
Branch/Sub Tree: A tree formed by splitting the tree.
Pruning: Pruning is the process of removing the unwanted branches from the
tree.
Parent/Child node: A node that is split into sub-nodes is called the parent node of
those sub-nodes, and the sub-nodes are called its child nodes.
Advantages of the Decision Tree
o It is simple to understand as it follows the same process which a human
follows while making any decision in real-life.
o It can be very useful for solving decision-related problems.
o It helps to think about all the possible outcomes for a problem.
o There is less requirement of data cleaning compared to other algorithms.
P Class
The P in the P class stands for Polynomial Time. It is the collection of decision
problems (problems with a “yes” or “no” answer) that can be solved by a
deterministic machine in polynomial time.
The solution to P problems is easy to find.
P is often regarded as the class of computational problems that are tractable.
Tractable means that the problems can be solved in theory as well as in
practice. Problems that can be solved in theory but not in practice are
known as intractable.
NP Class
The NP in NP class stands for Non-deterministic Polynomial Time. It is the
collection of decision problems that can be solved by a non-deterministic
machine in polynomial time.
Features:
The solutions of NP problems may be hard to find, but a proposed solution is
easy to verify.
A proposed solution to an NP problem can be verified by a deterministic Turing
machine in polynomial time.
Example:
Let us consider an example to better understand the NP class. Suppose a
company has a total of 1000 employees, each with a unique employee ID. Assume
that only 200 rooms are available for them, so a group of 200 employees must
be selected, but the CEO of the company has a list of pairs of employees
who can't be selected together due to personal reasons.
This is an example of an NP problem, since it is easy to check whether a given choice
of 200 employees proposed by a coworker is satisfactory, i.e. that no pair taken
from the coworker's list appears on the list given by the CEO. But generating such
a list from scratch seems to be so hard as to be completely impractical.
In other words, if someone provides us with a candidate solution to the problem, we
can check it for conflicting pairs in polynomial time. Thus, for a problem in the NP
class, a proposed answer can be verified in polynomial time even when finding one
appears to be hard.
This class contains many problems that one would like to be able to solve
effectively:
NP-complete class
A problem is NP-complete if it is both in NP and NP-hard. NP-complete problems
are the hardest problems in NP.
Features:
NP-complete problems are special because any problem in the NP class can be
transformed or reduced to an NP-complete problem in polynomial time.
If one could solve an NP-complete problem in polynomial time, then one
could also solve any NP problem in polynomial time.
Some example problems include:
1. Hamiltonian Cycle.
2. Satisfiability.
3. Vertex cover.
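Membership in NP for such problems rests on the fact that a proposed solution can be checked quickly. As a minimal sketch for Vertex cover, the function below checks whether a proposed set of at most k vertices touches every edge. The function name is_vertex_cover, its parameters and the sample graph are assumptions made for this example.

```python
def is_vertex_cover(edges, cover, k):
    """Return True if cover has at most k vertices and touches every edge.

    edges: list of (u, v) pairs of an undirected graph.
    """
    cover = set(cover)
    # A single pass over the edges: polynomial in the input size.
    return len(cover) <= k and all(u in cover or v in cover for u, v in edges)

# Hypothetical path graph 0-1-2-3 used only for illustration.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(is_vertex_cover(path_edges, {1, 2}, 2))  # every edge is touched
print(is_vertex_cover(path_edges, {0, 3}, 2))  # edge (1, 2) is not touched
```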