DAA - Module 4 Notes 4TH SEM

The document provides an overview of the greedy method, including its principles and applications such as minimum cost spanning trees and single-source shortest paths. It details Prim's and Kruskal's algorithms for finding minimum spanning trees, as well as Dijkstra's algorithm for shortest paths, highlighting their efficiency and use cases. Additionally, it discusses Huffman's algorithm for variable-length encoding, demonstrating how it optimizes data compression by assigning shorter codes to more frequent characters.

ADA Module 4: Greedy Method

1. Introduction to Greedy method


1.1 General method
The greedy method is a straightforward design technique applicable to a wide variety of
problems.
The greedy approach suggests constructing a solution through a sequence of steps, each
expanding a partially constructed solution obtained so far, until a complete solution to the
problem is reached. On each step the choice made must be:
• feasible, i.e., it has to satisfy the problem’s constraints
• locally optimal, i.e., it has to be the best local choice among all feasible choices
available on that step
• irrevocable, i.e., once made, it cannot be changed on subsequent steps of the algorithm
As a rule, greedy algorithms are both intuitively appealing and simple. Given an optimization
problem, it is usually easy to figure out how to proceed in a greedy manner, possibly after
considering a few small instances of the problem. What is usually more difficult is to prove
that a greedy algorithm yields an optimal solution (when it does).

www.techjourney.in
Anusha K, Asst. Professor, Kalpataru Institute of Technology

2. Minimum cost spanning trees


Definition: A spanning tree of a connected graph is its connected acyclic subgraph (i.e., a tree)
that contains all the vertices of the graph. A minimum spanning tree of a weighted connected
graph is its spanning tree of the smallest weight, where the weight of a tree is defined as the
sum of the weights on all its edges. The minimum spanning tree problem is the problem of
finding a minimum spanning tree for a given weighted connected graph.

2.1. Prim’s Algorithm


Prim's algorithm constructs a minimum spanning tree through a sequence of expanding
subtrees. The initial subtree in such a sequence consists of a single vertex selected arbitrarily from
the set V of the graph's vertices. On each iteration it expands the current tree in the greedy
manner by simply attaching to it the nearest vertex not in that tree, i.e., the vertex connected to
a tree vertex by an edge of the smallest weight. The algorithm stops after
all the graph's vertices have been included in the tree being constructed. Since the algorithm
expands a tree by exactly one vertex on each of its iterations, the total number of such iterations
is n - 1, where n is the number of vertices in the graph. The tree generated by the algorithm is
obtained as the set of edges used for the tree expansions.

Correctness: Prim’s algorithm always yields a minimum spanning tree.
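The description above can be sketched in Python as follows. This is a minimal illustration, not the notes' own pseudocode: the function name prim_matrix, the weight-matrix input with math.inf for missing edges, and the choice of vertex 0 as the starting vertex are all assumptions made for the example. It scans an unordered array for the nearest vertex on every iteration.

```python
import math

def prim_matrix(w):
    """Prim's algorithm on a symmetric weight matrix (hypothetical helper).

    w[u][v] is the weight of edge (u, v), or math.inf if there is no edge.
    Returns (tree_edges, total_weight), starting from vertex 0.
    """
    n = len(w)
    in_tree = [False] * n
    dist = [math.inf] * n      # weight of the lightest edge from each vertex to the tree
    parent = [None] * n        # tree vertex at the other end of that edge
    dist[0] = 0                # the arbitrarily chosen starting vertex
    edges, total = [], 0
    for _ in range(n):
        # Scan the unordered array for the nearest vertex not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: dist[v])
        if dist[u] == math.inf:
            raise ValueError("graph is not connected")
        in_tree[u] = True
        if parent[u] is not None:
            edges.append((parent[u], u, dist[u]))
            total += dist[u]
        # Update the priorities of the remaining vertices.
        for v in range(n):
            if not in_tree[v] and w[u][v] < dist[v]:
                dist[v] = w[u][v]
                parent[v] = u
    return edges, total
```

Each of the n iterations does two linear scans, which matches the quadratic behaviour one would expect from an unordered array.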


Example: An example of Prim’s algorithm is shown below. The parenthesized labels of a vertex
in the middle column indicate the nearest tree vertex and edge weight; selected vertices and
edges are shown in bold.

[Figure: trace of Prim’s algorithm, with columns "Tree vertices", "Remaining vertices", and "Illustration".]

Analysis of Efficiency
The efficiency of Prim’s algorithm depends on the data structures chosen for the graph itself
and for the priority queue of the set V − V_T, whose vertex priorities are the distances to the
nearest tree vertices.
1. If a graph is represented by its weight matrix and the priority queue is implemented as
an unordered array, the algorithm’s running time will be in Θ(|V|^2). Indeed, on each
of the |V| − 1 iterations, the array implementing the priority queue is traversed to find
and delete the minimum and then to update, if necessary, the priorities of the remaining
vertices.
We can implement the priority queue as a min-heap. (A min-heap is a complete binary tree in
which every element is less than or equal to its children.) Deletion of the smallest element from
and insertion of a new element into a min-heap of size n are O(log n) operations.
2. If a graph is represented by its adjacency lists and the priority queue is implemented
as a min-heap, the running time of the algorithm is in O(|E| log |V|).
This is because the algorithm performs |V| − 1 deletions of the smallest element and makes |E|
verifications and, possibly, changes of an element’s priority in a min-heap of size not exceeding
|V|. Each of these operations, as noted earlier, is an O(log |V|) operation. Hence, the running
time of this implementation of Prim’s algorithm is in
(|V| − 1 + |E|) · O(log |V|) = O(|E| log |V|), because, in a connected graph, |V| − 1 ≤ |E|.
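A heap-based version might look like the sketch below. It is an illustration under stated assumptions: the name prim_heap and the dictionary-of-adjacency-lists input are hypothetical, and instead of changing a priority inside the heap it uses Python's heapq with "lazy deletion" (outdated entries are skipped when popped). The heap can therefore hold up to |E| entries, giving O(|E| log |E|), which is the same order as O(|E| log |V|) for a simple graph.

```python
import heapq

def prim_heap(adj, start):
    """Prim's algorithm with a min-heap (hypothetical helper).

    adj maps each vertex to a list of (neighbour, weight) pairs; every
    undirected edge appears in both endpoints' lists. Vertex labels must be
    mutually comparable, because ties on weight fall back to comparing labels.
    Returns (tree_edges, total_weight).
    """
    visited = {start}
    heap = [(w, start, v) for v, w in adj[start]]
    heapq.heapify(heap)
    edges, total = [], 0
    while heap and len(visited) < len(adj):
        w, u, v = heapq.heappop(heap)   # lightest candidate edge leaving the tree
        if v in visited:
            continue                    # stale entry: v joined the tree earlier
        visited.add(v)
        edges.append((u, v, w))
        total += w
        for x, wx in adj[v]:
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    if len(visited) != len(adj):
        raise ValueError("graph is not connected")
    return edges, total
```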

2.2. Kruskal’s Algorithm


Background: Kruskal's algorithm, named after Joseph Kruskal, is another greedy algorithm for the
minimum spanning tree problem that also always yields an optimal solution. It looks at a
minimum spanning tree for a weighted
connected graph G = (V, E) as an acyclic subgraph with |V| - 1 edges for which the sum of
the edge weights is the smallest. Consequently, the algorithm constructs a minimum spanning
tree as an expanding sequence of subgraphs, which are always acyclic but are not necessarily
connected on the intermediate stages of the algorithm.
Working: The algorithm begins by sorting the graph's edges in non-decreasing order of their
weights. Then, starting with the empty subgraph, it scans this sorted list, adding the next edge
on the list to the current subgraph if such an inclusion does not create a cycle and simply
skipping the edge otherwise.

Note that E_T, the set of edges composing a minimum spanning tree of graph G, is actually a
tree at every stage of Prim's algorithm but generally just an acyclic subgraph in Kruskal's algorithm.

Kruskal’s algorithm is not simpler than Prim’s, because on each iteration it has to check whether
the addition of the next edge to the edges already selected would create a cycle.

We can consider the algorithm's operations as a progression through a series of forests
containing all the vertices of a given graph and some of its edges. The initial forest consists of
|V| trivial trees, each comprising a single vertex of the graph. The final forest consists of a
single tree, which is a minimum spanning tree of the graph. On each iteration, the algorithm
takes the next edge (u, v) from the sorted list of the graph's edges, finds the trees containing the
vertices u and v, and, if these trees are not the same, unites them in a larger tree by adding the
edge (u, v).

Analysis of Efficiency
The crucial check, whether two vertices belong to the same tree, can be carried out efficiently
with union-find algorithms.
With an efficient union-find algorithm, the running time of Kruskal’s algorithm is dominated by
the time needed for sorting the edge weights of
a given graph. Hence, with an efficient sorting algorithm, the time efficiency of Kruskal's
algorithm will be in O(|E| log |E|).
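The forest view and the union-find check can be sketched as below. This is a minimal illustration under assumptions: the function name kruskal, the vertex numbering 0..n-1, and the (weight, u, v) edge tuples are hypothetical. The union-find shown uses union by size and path halving, one common way to implement it.

```python
def kruskal(n, edges):
    """Kruskal's algorithm with union-find (hypothetical helper).

    n is the number of vertices, labelled 0..n-1.
    edges is a list of (weight, u, v) tuples.
    Returns (tree_edges, total_weight); for a disconnected graph the result
    is a minimum spanning forest with fewer than n - 1 edges.
    """
    parent = list(range(n))   # initially |V| trivial trees
    size = [1] * n

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the trees containing a and b; return False if already the same tree.
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]
        return True

    tree, total = [], 0
    for w, u, v in sorted(edges):      # non-decreasing order of weight
        if union(u, v):                # the edge joins two different trees: no cycle
            tree.append((u, v, w))
            total += w
            if len(tree) == n - 1:
                break
    return tree, total
```

The sort is the dominant cost here, which is consistent with the O(|E| log |E|) bound stated above.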

Illustration
An example of Kruskal’s algorithm is shown below. The selected edges are shown in bold.

[Figure: trace of Kruskal’s algorithm.]

3. Single source shortest paths

The single-source shortest-paths problem is defined as follows. For a given vertex called the
source in a weighted connected graph, the problem is to find shortest paths to all its other
vertices. The single-source shortest-paths problem asks for a family of paths, each leading from
the source to a different vertex in the graph, though some paths may, of course, have edges in
common.

3.1. Dijkstra's Algorithm

Dijkstra's algorithm is the best-known algorithm for the single-source shortest-paths problem.
This algorithm is applicable to undirected and directed graphs with nonnegative weights only.
Working: Dijkstra's algorithm finds the shortest paths to a graph's vertices in order of their
distance from a given source.
• First, it finds the shortest path from the source to a vertex nearest to it, then to a second
nearest, and so on.
• In general, before its ith iteration commences, the algorithm
has already identified the shortest paths to i − 1 other vertices
nearest to the source. These vertices, the source, and the
edges of the shortest paths leading to them from the source
form a subtree T_i of the given graph (shown in the figure).
• Since all the edge weights are nonnegative, the next vertex
nearest to the source can be found among the vertices adjacent to the vertices of T_i. The
set of vertices adjacent to the vertices in T_i can be referred to as "fringe vertices"; they
are the candidates from which Dijkstra's algorithm selects the next vertex nearest to the
source.
• To identify the ith nearest vertex, the algorithm computes, for every fringe vertex u, the
sum of the distance to the nearest tree vertex v (given by the weight of the edge (v, u))
and the length d_v of the shortest path from the source to v (previously determined by
the algorithm) and then selects the vertex with the smallest such sum. The fact that it
suffices to compare the lengths of such special paths is the central insight of Dijkstra's
algorithm.
• To facilitate the algorithm's operations, we label each vertex with two labels.
  o The numeric label d indicates the length of the shortest path from the source to this
  vertex found by the algorithm so far; when a vertex is added to the tree, d indicates
  the length of the shortest path from the source to that vertex.
  o The other label indicates the name of the next-to-last vertex on such a path, i.e.,
  the parent of the vertex in the tree being constructed. (It can be left unspecified for
  the source and for vertices that are adjacent to none of the current tree vertices.)
With such labeling, finding the next nearest vertex u* becomes a simple task of finding
a fringe vertex with the smallest d value. Ties can be broken arbitrarily.
• After we have identified a vertex u* to be added to the tree, we need to perform two
operations:
  o Move u* from the fringe to the set of tree vertices.
  o For each remaining fringe vertex u that is connected to u* by an edge of weight
  w(u*, u) such that d_u* + w(u*, u) < d_u, update the labels of u by u* and
  d_u* + w(u*, u), respectively.

Illustration: An example of Dijkstra's algorithm is shown below. The next closest vertex is
shown in bold.

[Figure: trace of Dijkstra's algorithm.]

The shortest paths (identified by following nonnumeric labels backward from a destination
vertex in the left column to the source) and their lengths (given by numeric labels of the tree
vertices) are as follows:

The pseudocode of Dijkstra’s algorithm is given below. Note that in the following pseudocode,
V_T contains a given source vertex and the fringe contains the vertices adjacent to it after
iteration 0 is completed.
Analysis:
The time efficiency of Dijkstra’s algorithm depends on the data structures used for
implementing the priority queue and for representing an input graph itself.
Efficiency is Θ(|V|^2) for graphs represented by their weight matrix and the priority queue
implemented as an unordered array.
For graphs represented by their adjacency lists and the priority queue implemented as a
min-heap, it is in O(|E| log |V|).
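As a rough illustration of the adjacency-list and min-heap variant, the labelling scheme described earlier (a numeric label d and a parent label per vertex) could be written in Python as follows. The names dijkstra and path_to and the dictionary-based graph format are assumptions made for this sketch, and stale heap entries are skipped rather than having their priority decreased in place.

```python
import heapq
import math

def dijkstra(adj, source):
    """Single-source shortest paths for nonnegative weights (hypothetical helper).

    adj maps every vertex to a list of (neighbour, weight) pairs.
    Returns (dist, parent): the numeric label d and the parent label per vertex.
    """
    dist = {v: math.inf for v in adj}
    parent = {v: None for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    done = set()                      # tree vertices, whose labels are final
    while heap:
        d, u = heapq.heappop(heap)    # fringe vertex with the smallest d
        if u in done:
            continue                  # stale entry left over from an earlier update
        done.add(u)
        for v, w in adj[u]:
            if v not in done and d + w < dist[v]:
                dist[v] = d + w       # shorter path found: update both labels
                parent[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, parent

def path_to(parent, target):
    """Follow parent labels backward from target to the source."""
    path = []
    while target is not None:
        path.append(target)
        target = parent[target]
    return path[::-1]
```

Calling path_to(parent, v) recovers a shortest path by following the nonnumeric labels backward, as described in the illustration above.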
Applications
• Transportation planning and packet routing in communication networks, including the
Internet
• Finding shortest paths in social networks, speech recognition, document formatting,
robotics, compilers, and airline crew scheduling.

4. Optimal Tree problem


Background:
Suppose we have to encode a text that comprises characters from some n-character alphabet
by assigning to each of the text's characters some sequence of bits called the codeword. There
are two types of encoding: fixed-length encoding and variable-length encoding.
Fixed-length encoding: This method assigns to each character a bit string of the same length
m (m ≥ log_2 n). This is exactly what the standard ASCII code does.
One way of getting a coding scheme that yields a shorter bit string on the average is based on
the old idea of assigning shorter codewords to more frequent characters and longer codewords
to less frequent characters.
Variable-length encoding: This method, which assigns codewords of different lengths to different
characters, introduces a problem that fixed-length encoding does not have. Namely, how can
we tell how many bits of an encoded text represent the first (or, more generally, the
ith) character? To avoid this complication, we can limit ourselves to prefix-free (or simply prefix) codes.
In a prefix code, no codeword is a prefix of a codeword of another character. Hence, with such
an encoding, we can simply scan a bit string until we get the first group of bits that is a
codeword for some character, replace these bits by this character, and repeat this operation
until the bit string's end is reached.
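The scanning procedure just described can be sketched in a few lines of Python. The function names encode and decode are hypothetical, and the code table EXAMPLE_CODE is an assumption: it is chosen to be consistent with the DAD and BAD_AD examples that appear later in these notes, not copied from a table in the source.

```python
# Assumed prefix code for the alphabet {A, B, C, D, _}; no codeword is a
# prefix of another, so decoding never needs to look ahead.
EXAMPLE_CODE = {"A": "11", "B": "100", "C": "00", "D": "01", "_": "101"}

def encode(text, code):
    """Concatenate the codewords of the characters of text."""
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    """Scan bits left to right, emitting a character whenever the bits read
    so far form a complete codeword."""
    inverse = {cw: ch for ch, cw in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)
```

For example, decode("10011011011101", EXAMPLE_CODE) reads 100, 11, 01, 101, 11, 01 in turn.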
If we want to create a binary prefix code for some alphabet, it is natural to associate the
alphabet's characters with leaves of a binary tree in which all the left edges are labelled by 0
and all the right edges are labelled by 1 (or vice versa). The codeword of a character can then
be obtained by recording the labels on the simple path from the root to the character's leaf.
Since there is no simple path to a leaf that continues to another leaf, no codeword can be a
prefix of another codeword; hence, any such tree yields a prefix code.
Among the many trees that can be constructed in this manner for a given alphabet with known
frequencies of the character occurrences, construction of such a tree that would assign shorter
bit strings to high-frequency characters and longer ones to low-frequency characters can be
done by the following greedy algorithm, invented by David Huffman.
4.1 Huffman Trees and Codes
Huffman's Algorithm
Step 1: Initialize n one-node trees and label them with the characters of the alphabet. Record
the frequency of each character in its tree's root to indicate the tree's weight. (More generally,
the weight of a tree will be equal to the sum of the frequencies in the tree's leaves.)
Step 2: Repeat the following operation until a single tree is obtained. Find two trees with the
smallest weight. Make them the left and right subtree of a new tree and record the sum of their
weights in the root of the new tree as its weight.
A tree constructed by the above algorithm is called a Huffman tree. It defines, in the manner
described above, a Huffman code.
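Steps 1 and 2 can be sketched with Python's heapq as the container of trees. This is an illustrative sketch under assumptions: the name huffman_codes, the dictionary input, and the representation of a tree as either a single character (leaf) or a (left, right) tuple are choices made here, and a running counter breaks ties between equal weights so that trees are never compared directly. The input is assumed to be non-empty.

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Build a Huffman code from a dict {character: frequency} (hypothetical helper).

    Returns a dict {character: codeword}; left edges are labelled 0, right edges 1.
    """
    tiebreak = count()
    # Step 1: n one-node trees, each weighted by its character's frequency.
    heap = [(w, next(tiebreak), ch) for ch, w in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate one-character alphabet
        return {heap[0][2]: "0"}
    # Step 2: repeatedly join the two trees of smallest weight.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}

    def walk(tree, prefix):
        # Record the edge labels on the path from the root to each leaf.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes
```

Different tie-breaking rules can produce different trees, but the weighted sum of the codeword lengths comes out the same for every Huffman tree.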
Example: Consider the five-symbol alphabet {A, B, C, D, _} with the following occurrence
frequencies in a text made up of these symbols:

symbol      A     B     C     D     _
frequency   0.35  0.1   0.2   0.2   0.15

The Huffman tree construction for the above problem is shown below:

[Figure: Huffman tree construction.]

The resulting codewords are as follows:

symbol      A     B     C     D     _
codeword    11    100   00    01    101

Hence, DAD is encoded as 011101, and 10011011011101 is decoded as BAD_AD.
With the occurrence frequencies given and the codeword lengths obtained, the average
number of bits per symbol in this code is
2 · 0.35 + 3 · 0.1 + 2 · 0.2 + 2 · 0.2 + 3 · 0.15 = 2.25.
Had we used a fixed-length encoding for the same alphabet, we would have to use at least 3
bits per symbol. Thus, for this example, Huffman’s code achieves a compression ratio
(a standard measure of a compression algorithm’s effectiveness) of (3 − 2.25)/3 · 100% = 25%.
In other words, Huffman’s encoding of the above text will use 25% less memory than its
fixed-length encoding.

*****
