
Analysis and Design of Algorithms BCS401

MODULE-4
DYNAMIC PROGRAMMING

• Dynamic programming is an algorithm design technique invented by a prominent U.S. mathematician, Richard Bellman, in the 1950s as a general method for optimizing multistage decision processes.

• Dynamic programming is a technique for solving problems with overlapping subproblems. Typically, these subproblems arise from a recurrence relating a given problem’s solution to solutions of its smaller subproblems.

• Rather than solving overlapping subproblems again and again, dynamic programming
suggests solving each of the smaller subproblems only once and recording the results in
a table from which a solution to the original problem can then be obtained.

Three Basic Examples


EXAMPLE 1 Coin-row problem

• Problem Statement: There is a row of n coins whose values are some positive
integers c1, c2, . . . , cn, not necessarily distinct.
• The goal is to pick up the maximum amount of money subject to the constraint
that no two coins adjacent in the initial row can be picked up.

• Let F(n) be the maximum amount that can be picked up from the row of n
coins.

• To derive a recurrence for F(n), we partition all the allowed coin selections into two
groups: those that include the last coin and those without it.
• The largest amount we can get from the first group is equal to,
cn+ F(n − 2)—the value of the nth coin plus the maximum amount we can pick up
from the first n − 2 coins.
• The maximum amount we can get from the second group is equal to,
F(n − 1) by the definition of F(n).


Thus, we have the following recurrence subject to the obvious initial conditions:
F(n) = max{cn + F(n − 2), F(n − 1)} for n > 1, (1)
F(0) = 0, F(1) = c1.
We can compute F(n) by filling the one-row table left to right.

The application of the algorithm to the coin row of denominations 5, 1, 2, 10, 6, 2 is shown in Figure 8.1. It yields the maximum amount of 17.

• To find the coins with the maximum total value, we need to backtrace the computations to see which of the two possibilities, cn + F(n − 2) or F(n − 1), produced the maximum in formula (1).

• In the last application of the formula, it was the sum c6 + F(4), which means that the
coin c6 = 2 is a part of an optimal solution. Moving to computing F(4), the maximum
was produced by the sum c4 + F(2), which means that the coin c4 = 10 is a part of an
optimal solution as well.

• Finally, the maximum in computing F(2) was produced by F(1), implying that the coin
c2 is not the part of an optimal solution and the coin c1= 5 is. Thus, the optimal solution
is {c1, c4, c6}.
Analysis:
Using the CoinRow algorithm to find F(n), the largest amount of money that can be picked up, as well as the coins composing an optimal set, clearly takes Ɵ(n) time and Ɵ(n) space.
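A minimal Python sketch of this approach is given below, using the coin row 5, 1, 2, 10, 6, 2 from Figure 8.1; it fills the one-row table left to right and then backtraces to recover an optimal set of coins:

def coin_row(coins):
    n = len(coins)
    F = [0] * (n + 1)              # F[k] = max amount from the first k coins
    if n > 0:
        F[1] = coins[0]
    for k in range(2, n + 1):
        # either take coin k plus the best of the first k-2 coins, or skip it
        F[k] = max(coins[k - 1] + F[k - 2], F[k - 1])
    picked, k = [], n              # backtrace the computations
    while k > 0:
        if k == 1 or coins[k - 1] + F[k - 2] > F[k - 1]:
            picked.append(k)       # coin c_k is part of an optimal solution
            k -= 2
        else:
            k -= 1
    return F[n], sorted(picked)

print(coin_row([5, 1, 2, 10, 6, 2]))   # (17, [1, 4, 6]), i.e., {c1, c4, c6}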

EXAMPLE 2 Change-making problem


(1. Explain Coin Change Problem with example 06M)
Problem Statement: Consider the general instance of the following well-known problem. Give change for amount n using the minimum number of coins of denominations d1 < d2 < . . . < dm.
Here, we consider a dynamic programming algorithm for the general case, assuming availability of unlimited quantities of coins for each of the m denominations d1 < d2 < . . . < dm, where d1 = 1.

• Let F(n) be the minimum number of coins whose values add up to n;


• It is convenient to define F(0) = 0.
• The amount n can only be obtained by adding one coin of denomination
dj to the amount n − dj for j = 1, 2, . . . , m such that n ≥ dj .

• Therefore, we can consider all such denominations and select the one minimizing F(n − dj) + 1.
• Since 1 is a constant, we can, of course, find the smallest F(n − dj) first and then add 1 to it. Hence, we have the following recurrence for F(n):

F(n) = min{F(n − dj) : j such that n ≥ dj} + 1 for n > 0, F(0) = 0.


Example: The application of the algorithm to amount n = 6 and denominations 1, 3, 4 is shown in Figure 8.2. The answer it yields is two coins.

The time and space efficiencies of the algorithm are obviously O(nm) and
Ɵ(n), respectively.
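The recurrence translates directly into the following minimal Python sketch, shown here on the instance above (n = 6, denominations 1, 3, 4):

def change_making(n, denoms):
    # denoms is assumed sorted in increasing order with denoms[0] == 1
    INF = float("inf")
    F = [0] + [INF] * n            # F[k] = min number of coins adding up to k
    for k in range(1, n + 1):
        F[k] = min(F[k - d] for d in denoms if d <= k) + 1
    return F[n]

print(change_making(6, [1, 3, 4]))   # 2 (two coins: 3 + 3)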

EXAMPLE 3 Coin-collecting problem



Problem Statement:

Several coins are placed in cells of an n × m board, no more than one coin per cell.

A robot, located in the upper left cell of the board, needs to collect as many of the coins as
possible and bring them to the bottom right cell.

On each step, the robot can move either one cell to the right or one cell down from its current
location. When the robot visits a cell with a coin, it always picks up that coin.

Design an algorithm to find the maximum number of coins the robot can collect and a path it
needs to follow to do this.
• Let F(i, j) be the largest number of coins the robot can collect and bring to the cell (i, j)
• It can reach this cell either from the adjacent cell (i − 1, j) above it or from the adjacent
cell (i, j − 1) to the left of it.
• The largest numbers of coins that can be brought to these cells are F(i − 1, j) and F(i, j
− 1), respectively.
• The largest number of coins the robot can bring to cell (i, j ) is the maximum of these
two numbers plus one possible coin at cell (i, j ) itself. In other words, we have the
following formula for F(i, j):

F(i, j) = max{F(i − 1, j), F(i, j − 1)} + cij for 1 ≤ i ≤ n, 1 ≤ j ≤ m,
F(0, j) = 0 for 1 ≤ j ≤ m and F(i, 0) = 0 for 1 ≤ i ≤ n,

where cij = 1 if there is a coin in cell (i, j), and cij = 0 otherwise.


Example

Analysis: Computing all the F(i, j) values takes Ɵ(nm) time; the optimal path itself can then be obtained by backtracing in Ɵ(n + m) time.
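A minimal Python sketch of the table computation follows; the 3 × 3 board used here is a hypothetical example, since the figure's board is not reproduced in the text:

def robot_coin_collection(C):
    # C is an n x m matrix with C[i][j] = 1 if cell (i+1, j+1) holds a coin
    n, m = len(C), len(C[0])
    F = [[0] * (m + 1) for _ in range(n + 1)]   # row 0 and column 0 stay 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j], F[i][j - 1]) + C[i - 1][j - 1]
    return F[n][m]    # the optimal path can be traced back from this cell

board = [[0, 0, 1],
         [1, 0, 0],
         [0, 1, 1]]
print(robot_coin_collection(board))   # 3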

The Knapsack Problem and Memory Functions


(1. Design an algorithm to solve Knapsack problem using dynamic programming. Apply the
same to solve the following Knapsack problem where W=50 10M

2. Consider the following instance to solve the knapsack problem using Dynamic Programming

10M
3. Solve the instance of 0/1 Knapsack problem using dynamic programming approach 10M)

Problem statement:
Given n items of known weights w1, . . . , wn and values v1, . . . , vn and a knapsack of
capacity W, find the most valuable subset of the items that fit into the knapsack.
• To design a dynamic programming algorithm, we need to derive a recurrence relation
that expresses a solution to an instance of the knapsack problem in terms of solutions to
its smaller sub-instances.
• Let us consider an instance defined by the first i items, 1≤ i ≤ n, with weights w1, . . . ,
wi, values v1, . . . , vi , and knapsack capacity j, 1 ≤ j ≤ W.
• Let F(i, j) be the value of an optimal solution to this instance.
• We can divide all the subsets of the first i items that fit the knapsack of capacity j into
two categories: those that do not include the ith item and those that do.
1. Among the subsets that do not include the ith item, the value of an optimal subset
is, by definition, F(i − 1, j).
2. Among the subsets that do include the ith item (hence, j − wi ≥ 0), an optimal subset
is made up of this item and an optimal subset of the first i − 1 items that fits into the
knapsack of capacity j − wi . The value of such an optimal subset is vi+ F(i − 1, j −
wi).
• These observations lead to the following recurrence:

F(i, j) = max{F(i − 1, j), vi + F(i − 1, j − wi)} if j − wi ≥ 0,
F(i, j) = F(i − 1, j) if j − wi < 0.
• It is convenient to define the initial conditions as follows:


F(0, j) = 0 for j ≥ 0 and
F(i, 0) = 0 for i ≥ 0.
• Our goal is to find F(n, W), the maximal value of a subset of the n given items that fit
into the knapsack of capacity W, and an optimal subset itself.


Solution:
Given n = 4 and W = 5, construct a matrix F of n+1 rows and W+1 columns. As per the initial conditions, if i = 0 or j = 0 then F[i, j] = 0.
So the initial table will be as shown below:

                 Capacity of the bag, j
Items i      0    1    2    3    4    5
    0        0    0    0    0    0    0
    1        0
    2        0
    3        0
    4        0

Applying the formula for i = 1, 2, 3, 4 and j = 1, 2, 3, 4, 5, we get the following updated table:


The time efficiency and space efficiency of this algorithm are both in Ɵ(nW).
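A minimal bottom-up Python sketch of the recurrence is shown below; since the figure with the instance data is not reproduced here, the four items used are a hypothetical example:

def knapsack(weights, values, W):
    n = len(weights)
    F = [[0] * (W + 1) for _ in range(n + 1)]   # F[0][j] = F[i][0] = 0
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if j >= weights[i - 1]:             # item i can fit into capacity j
                F[i][j] = max(F[i - 1][j],
                              values[i - 1] + F[i - 1][j - weights[i - 1]])
            else:
                F[i][j] = F[i - 1][j]
    return F[n][W]

print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))   # 37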

Memory Functions
• The classic dynamic programming approach works bottom up: it fills a table with solutions to all smaller subproblems, but each of them is solved only once.
• An unsatisfying aspect of this approach is that solutions to some of these smaller
subproblems are often not necessary for getting a solution to the problem given.
• Since this drawback is not present in the top-down approach, it is natural to try to
combine the strengths of the top-down and bottom-up approaches.
• The goal is to get a method that solves only subproblems that are necessary and does so
only once. Such a method exists; it is based on using memory functions.
• This method solves a given problem in the top-down manner.
• Initially, all the table’s entries are initialized with a special “null” symbol to indicate
that they have not yet been calculated.
• Thereafter, whenever a new value needs to be calculated, the method checks the corresponding entry in the table first: if this entry is not “null”, it is simply retrieved from the table; otherwise, it is computed by the recursive call whose result is then recorded in the table.
• After initializing the table, the recursive function needs to be called with i
= n (the number of items) and j = W (the knapsack capacity).
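A minimal Python sketch of the memory function for the knapsack problem, following the scheme just described (here −1 plays the role of the “null” symbol):

def knapsack_memo(weights, values, W):
    n = len(weights)
    F = [[-1] * (W + 1) for _ in range(n + 1)]
    for j in range(W + 1): F[0][j] = 0          # initial conditions
    for i in range(n + 1): F[i][0] = 0
    def mf(i, j):
        if F[i][j] < 0:                         # entry not yet calculated
            if j < weights[i - 1]:
                value = mf(i - 1, j)
            else:
                value = max(mf(i - 1, j),
                            values[i - 1] + mf(i - 1, j - weights[i - 1]))
            F[i][j] = value                     # record the result in the table
        return F[i][j]
    return mf(n, W)                             # called with i = n and j = W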


EXAMPLE 2 Let us apply the memory function method to the instance considered in
Example 1.

The time complexity remains the same as that of the bottom-up approach: Ɵ(nW).


Warshall’s and Floyd’s Algorithms

Transitive Closure
(1. Define Transitive Closure of a directed graph. Write Warshall’s algorithm to find
transitive closure. Apply the same to find the transitive closure of the digraph given below
in fig: 10M

2. With an algorithm, solve the below given graph to compute transitive closure 06M

3. Explain transitive closure of a directed graph and find the transitive closure for the given
graph 10M)

• The adjacency matrix A = {aij} of a directed graph is the boolean matrix that has 1 in
its ith row and jth column if and only if there is a directed edge from the ith vertex to
the jth vertex.
• We may also be interested in a matrix containing the information about the existence of
directed paths of arbitrary lengths between vertices of a given graph.
• Such a matrix is called the transitive closure of the digraph.

DEFINITION The transitive closure of a directed graph with n vertices can be defined as the
n × n boolean matrix T = {tij}, in which the element in the ith row and the jth column is 1 if
there exists a nontrivial path (i.e., directed path of a positive length) from the ith vertex to the
jth vertex; otherwise, tij is 0.

Example:


Warshall’s Algorithm

(1. Apply Warshall’s algorithm to find the transitive closure of the following graph 10M

2. Define Transitive Closure. Write Warshall’s algorithm to compute transitive closure.


Illustrate using the following directed graph. 10M

3. Define transitive closure of a graph. Apply Warshall's algorithm to compute transitive closure of a directed graph 10M)

• One of the most efficient algorithms to find the transitive closure of a given digraph is Warshall’s algorithm.
• Warshall’s algorithm constructs the transitive closure through a series of n × n boolean matrices:

R(0), . . . , R(k−1), R(k), . . . , R(n).
• Each of these matrices provides certain information about directed paths in the digraph.
• Specifically, the element r(k)ij in the ith row and jth column of matrix R(k) (i, j = 1, 2, . .
. , n, k = 0, 1, . . . , n) is equal to 1 if and only if there exists a directed path of a positive
length from the ith vertex to the jth vertex with each intermediate vertex, if any,
numbered not higher than k.
• Thus, the series starts with R(0), which does not allow any intermediate vertices in its
paths; hence, R(0) is nothing other than the adjacency matrix of the digraph.
• R(1) contains the information about paths that can use the first vertex as intermediate; R(2) contains the information about paths that can use the first and second vertices as intermediate, and so on.
• The last matrix in the series, R(n), reflects paths that can use all n vertices of the digraph
as intermediate and hence is nothing other than the digraph’s transitive closure.

• The formula is,

r(k)ij = r(k−1)ij or (r(k−1)ik and r(k−1)kj).

The time complexity is Ɵ(n³).
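A minimal Python sketch of Warshall's algorithm; the matrix R is updated in place, so R plays the role of R(k−1) before iteration k and of R(k) after it:

def warshall(A):
    # A is the n x n boolean adjacency matrix (0/1 entries) of the digraph
    n = len(A)
    R = [row[:] for row in A]                   # R(0) is the adjacency matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # a path avoiding vertices above k exists, or one through k does
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R                                    # R(n), the transitive closure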


Example


Floyd’s Algorithm for the All-Pairs Shortest-Paths


Problem
(1. Write Floyd’s algorithm for computing all-pairs shortest paths. Derive its time complexity 06M

3. Apply Floyd’s algorithm to find the all-pairs shortest paths for the given adjacency matrix 10M)


• “Given a weighted connected graph (undirected or directed), the all-pairs shortest-


paths problem asks to find the distances—i.e., the lengths of the shortest paths—from
each vertex to all other vertices. This is one of several variations of the problem involving
shortest paths in graphs”.
• Applications: in communications, transportation networks, and operations research
• It is convenient to record the lengths of shortest paths in an n × n matrix D called the distance matrix: the element dij in the ith row and the jth column of this matrix indicates the length of the shortest path from the ith vertex to the jth vertex.
For example,

• Floyd’s algorithm computes the distance matrix of a weighted graph with


n vertices through a series of n × n matrices:

D(0), . . . , D(k−1), D(k), . . . , D(n).
• Each of these matrices contains the lengths of shortest paths with certain constraints on
the paths considered for the matrix in question.
• Specifically, the element d(k)ij in the ith row and the jth column of matrix D(k) (i, j = 1,
2, . . . , n,k = 0, 1, . . . , n) is equal to the length of the shortest path among all paths from
the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher
than k.


The formula is,

d(k)ij = min{d(k−1)ij, d(k−1)ik + d(k−1)kj} for k ≥ 1, d(0)ij = wij.

The time complexity is Ɵ(n³).
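A minimal Python sketch of Floyd's algorithm, with inf standing for the absence of an edge in the weight matrix:

from math import inf

def floyd(W):
    # W is the n x n weight matrix; W[i][j] = inf if there is no edge i -> j
    n = len(W)
    D = [row[:] for row in W]                   # D(0) is the weight matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # keep the old path or route through vertex k, whichever is shorter
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    return D                                    # D(n), the distance matrix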

Example: Consider the following digraph. Apply Floyd’s algorithm to get all-pairs
shortest path.


Greedy Technique

Prim’s Algorithm, Kruskal’s Algorithm, Dijkstra’s Algorithm, Huffman Trees and Codes.

The greedy approach suggests constructing a solution through a sequence of steps, each
expanding a partially constructed solution obtained so far, until a complete solution to the
problem is reached. On each step the choice made must be:
➢ feasible, i.e., it has to satisfy the problem’s constraints
➢ locally optimal, i.e., it has to be the best local choice among all feasible choices
available on that step
➢ irrevocable, i.e., once made, it cannot be changed on subsequent steps of the algorithm
• These requirements explain the technique’s name: on each step, it suggests a “greedy”
grab of the best alternative available in the hope that a sequence of locally optimal
choices will yield a (globally) optimal solution to the entire problem.

• The following problem arises naturally in many practical situations: given n points,
connect them in the cheapest possible way so that there will be a path between every
pair of points. It has direct applications to the design of all kinds of networks—
including communication, computer, transportation, and electrical—by providing the
cheapest way to achieve connectivity.

• We can represent the points given by vertices of a graph, possible connections by the
graph’s edges, and the connection costs by the edge weights. Then the question can be
posed as the minimum spanning tree problem, defined formally as follows.

DEFINITION A spanning tree of an undirected connected graph is its connected acyclic subgraph (i.e., a tree) that contains all the vertices of the graph.
• If such a graph has weights assigned to its edges, a minimum spanning tree is its
spanning tree of the smallest weight, where the weight of a tree is defined as the sum of
the weights on all its edges.
• The minimum spanning tree problem is the problem of finding a minimum spanning
tree for a given weighted connected graph.

Example: Consider the following graph,

The possible minimum spanning trees that can be generated are:

Here T1 is the minimum spanning tree.


Two popular algorithms to solve the minimum spanning tree problem are:

➢ Prim’s algorithm
➢ Kruskal’s algorithm

Prim’s Algorithm
(1. Solve the below instance using Prim’s algorithm to compute the minimum cost spanning tree. Mention time complexity. 08M

2. Write an algorithm for minimum spanning tree using Prim’s 08M


3. Apply Prim’s algorithm to obtain the minimum cost spanning tree for the given weighted connected graph 06M

4. Define minimum spanning tree. Write Kruskal’s algorithm to find minimum spanning tree. Illustrate with the following undirected graph. 10M

5. Apply Prim’s and Kruskal’s algorithms to find the minimal cost spanning tree for the graph given 10M)

Prim’s algorithm constructs a minimum spanning tree through a sequence of expanding


subtrees. The initial subtree in such a sequence consists of a single vertex selected arbitrarily
from the set V of the graph’s vertices. On each iteration, the algorithm expands the current tree
in the greedy manner by simply attaching to it the nearest vertex not in that tree. The algorithm
stops after all the graph’s vertices have been included in the tree being constructed. Since the
algorithm expands a tree by exactly one vertex on each of its iterations, the total number of such
iterations is n − 1, where n is the number of vertices in the graph.
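The following is a minimal Python sketch of Prim's algorithm under the assumption that the graph is given as a dictionary of weighted adjacency dictionaries; it uses a plain linear scan in place of a priority queue, matching the Ɵ(|V|²) case analyzed below:

from math import inf

def prim(graph, start):
    # graph: {vertex: {neighbor: weight, ...}, ...}, undirected and connected
    dist = {v: inf for v in graph}     # priority: distance to nearest tree vertex
    parent = {v: None for v in graph}
    dist[start] = 0
    in_tree, mst_edges = set(), []
    while len(in_tree) < len(graph):
        # attach the nearest vertex not yet in the tree
        u = min((v for v in graph if v not in in_tree), key=lambda v: dist[v])
        in_tree.add(u)
        if parent[u] is not None:
            mst_edges.append((parent[u], u, graph[parent[u]][u]))
        for w, weight in graph[u].items():
            if w not in in_tree and weight < dist[w]:
                dist[w], parent[w] = weight, u
    return mst_edges                   # the n - 1 edges of a minimum spanning tree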


Example:
Let us apply Prim’s algorithm to find the minimum cost spanning tree for the graph shown below:

The steps involved are shown below:


Analysis:
The efficiency of Prim’s algorithm depends on the data structures chosen for the graph itself and for the priority queue of the set V − VT whose vertex priorities are the distances to the nearest tree vertices.
➢ If a graph is represented by its weight matrix and the priority queue is implemented as an unordered array, the algorithm’s running time will be in Ɵ(|V|²).


Note: We can also implement the priority queue as a min-heap. Deletion of the smallest
element from and insertion of a new element into a min-heap of size n are O(log n) operations.

➢ If a graph is represented by its adjacency lists and the priority queue is implemented as a min-heap, the running time of the algorithm is in O(|E| log |V|).

Kruskal’s Algorithm
Kruskal’s algorithm looks at a minimum spanning tree of a weighted connected graph G = (V,
E) as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is the
smallest.
The algorithm begins by sorting the graph’s edges in nondecreasing order of their weights. Then,
starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to the
current subgraph if such an inclusion does not create a cycle and simply skipping the edge
otherwise.
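A minimal Python sketch of Kruskal's algorithm; the cycle check uses a simple union-find scheme of the kind discussed in the next subsection:

def kruskal(vertices, edges):
    # edges: list of (weight, u, v) tuples of an undirected connected graph
    parent = {v: v for v in vertices}           # each vertex starts in its own subset
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    mst = []
    for weight, u, v in sorted(edges):          # nondecreasing order of weights
        ru, rv = find(u), find(v)
        if ru != rv:                            # no cycle: different components
            parent[ru] = rv                     # union the two subsets
            mst.append((u, v, weight))
    return mst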


Example:
Consider the same graph used for Prim’s algorithm and apply Kruskal’s algorithm by arranging the edges of the given graph in nondecreasing order of their weights as follows:


Analysis:
• Applying Prim’s and Kruskal’s algorithms to the same small graph by hand may create
the impression that the latter is simpler than the former. This impression is wrong
because, on each of its iterations, Kruskal’s algorithm has to check whether the
addition of the next edge to the edges already selected would create a cycle.
• A new cycle is created if and only if the new edge connects two vertices already
connected by a path, i.e., if and only if the two vertices belong to the same connected
component. There are efficient algorithms for doing so, including the crucial check for
whether two vertices belong to the same tree. They are called union-find algorithms.
• With an efficient union-find algorithm, the running time of Kruskal’s algorithm will be
dominated by the time needed for sorting the edge weights of a given graph.
• Hence, with an efficient sorting algorithm, the time efficiency of Kruskal’s algorithm
will be in O(|E| log |E|).

Disjoint Subsets and Union-Find Algorithms


Kruskal’s algorithm is one of a number of applications that require a dynamic partition of some
n element set S into a collection of disjoint subsets S1, S2, . . . , Sk. After being initialized as a
collection of n one-element subsets, each containing a different element of S, the collection is
subjected to a sequence of intermixed union and find operations.

Thus, we are dealing here with an abstract data type of a collection of disjoint subsets of a finite
set with the following operations:
➢ makeset(x) creates a one-element set {x}. It is assumed that this operation can be
applied to each of the elements of set S only once.

➢ find(x) returns a subset containing x.


➢ union(x, y) constructs the union of the disjoint subsets Sx and Sy containing x and y,
respectively, and adds it to the collection to replace Sx and Sy, which are deleted from
it.
For example, let S = {1, 2, 3, 4, 5, 6}.
• Then makeset(i) creates the set {i} and applying this operation six times initializes the
structure to the collection of six singleton sets:
{1}, {2}, {3}, {4}, {5}, {6}.


• Performing union(1, 4) and union(5, 2) yields {1, 4}, {5, 2}, {3}, {6}, and, if followed
by union(4, 5) and then by union(3, 6), we end up with the disjoint subsets {1, 4, 5, 2},
{3, 6}.
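A minimal Python sketch of these three operations (a simple quick-union version, without the size or rank optimizations), replaying the example above:

class DisjointSets:
    def __init__(self):
        self.parent = {}
    def makeset(self, x):
        self.parent[x] = x                      # creates the one-element set {x}
    def find(self, x):
        while self.parent[x] != x:              # follow parent links to the root
            x = self.parent[x]
        return x                                # the subset's representative
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

ds = DisjointSets()
for i in range(1, 7):
    ds.makeset(i)                               # {1}, {2}, {3}, {4}, {5}, {6}
ds.union(1, 4); ds.union(5, 2)                  # {1, 4}, {5, 2}, {3}, {6}
ds.union(4, 5); ds.union(3, 6)                  # {1, 4, 5, 2}, {3, 6}
print(ds.find(1) == ds.find(2))                 # True: 1 and 2 share a subset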
(Application of Union-find algorithms for Kruskal’s is discussed in class. Refer class work)

Dijkstra’s Algorithm
(1. Find the shortest path for the given input using Dijkstra’s algorithm. Consider the source
node as ‘g’ 08 M

2. Apply single source shortest path algorithm to the following graph. Assume a as source
vertex 08M

3. Apply Dijkstra’s algorithm to find the single source shortest path for given graph,
considering ‘s’ as source vertex. Illustrate each step. 10M

)
Single-source shortest-paths problem: For a given vertex called the source
in a weighted connected graph, find shortest paths to all its other vertices.

The single-source shortest-paths problem asks for a family of paths, each leading from the
source to a different vertex in the graph, though some paths may, of course, have edges in
common.
Applications are transportation planning and packet routing in communication networks,
including the Internet.

The best-known algorithm for the single-source shortest-paths problem is Dijkstra’s algorithm. This algorithm is applicable to undirected and directed


graphs with nonnegative weights only. Since in most applications this condition is satisfied, the
limitation has not impaired the popularity of Dijkstra’s algorithm.

Working principle: Dijkstra’s algorithm finds the shortest paths to a graph’s vertices in order
of their distance from a given source. First, it finds the shortest path from the source to a vertex
nearest to it, then to a second nearest, and so on. In general, before its ith iteration commences,
the algorithm has already identified the shortest paths to i − 1 other vertices nearest to the source.
The algorithm is:
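The pseudocode figure is not reproduced here; the following is a minimal Python sketch using a min-heap priority queue, consistent with the analysis given after the example:

import heapq
from math import inf

def dijkstra(graph, source):
    # graph: {vertex: {neighbor: nonnegative weight, ...}, ...}
    dist = {v: inf for v in graph}
    dist[source] = 0
    pq = [(0, source)]                          # min-heap priority queue
    visited = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in visited:
            continue                            # skip stale queue entries
        visited.add(u)                          # u's shortest distance is final
        for w, weight in graph[u].items():
            if d + weight < dist[w]:
                dist[w] = d + weight
                heapq.heappush(pq, (dist[w], w))
    return dist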

Example:
Let us apply Dijkstra’s algorithm to find single-source shortest path for the graph given
below:


Analysis:
The time efficiency of Dijkstra’s algorithm depends on the data structures used for
implementing the priority queue and for representing an input graph itself (Just like Prim’s
algorithm).
➢ It is in Ɵ(|V|²) for graphs represented by their weight matrix and the priority queue implemented as an unordered array.
➢ For graphs represented by their adjacency lists and the priority queue implemented as a min-heap, it is in O(|E| log |V|).


Huffman Trees and Codes


(1. Explain Huffman Coding Concept 06M

2. Construct the Huffman tree and resulting code word for the following set of values 08M

3. Construct the Huffman tree and resulting code word for the following 10M

4. What are Huffman Trees? Construct the Huffman tree for the following data. 10M)

Suppose we have to encode a text that comprises symbols from some n-symbol alphabet by assigning to each of the text’s symbols some sequence of bits called the codeword.
• For example, we can use a fixed-length encoding that assigns to each symbol a bit
string of the same length m (m ≥ log2 n). This is exactly what the standard ASCII code
does. One way of getting a coding scheme that yields a shorter bit string on the average
is based on the old idea of assigning shorter codewords to more frequent symbols and
longer codewords to less frequent symbols.
• Variable-length encoding, which assigns codewords of different lengths to different
symbols, introduces a problem that fixed-length encoding does not have. Namely, how
can we tell how many bits of an encoded text represent the first (or, more generally, the
ith) symbol? To avoid this complication, we can limit ourselves to the so-called prefix-
free (or simply prefix) codes.
• In a prefix code, no codeword is a prefix of a codeword of another symbol. Hence, with
such an encoding, we can simply scan a bit string until we get the first group of bits that
is a codeword for some symbol, replace these bits by this symbol, and repeat this
operation until the bit string’s end is reached.
• If we want to create a binary prefix code for some alphabet, it is natural to associate the
alphabet’s symbols with leaves of a binary tree in which all the left edges are labeled
by 0 and all the right edges are labeled by 1. The codeword of a symbol can then be
obtained by recording the labels on the simple path from the root to the symbol’s leaf.
One such algorithm is a greedy algorithm invented by David Huffman.

Huffman’s algorithm

• Step 1: Initialize n one-node trees and label them with the symbols of the alphabet
given. Record the frequency of each symbol in its tree’s root to indicate the tree’s
weight. (More generally, the weight of a tree will be equal to the sum of the frequencies
in the tree’s leaves.)
• Step 2: Repeat the following operation until a single tree is obtained. Find two trees
with the smallest weight (ties can be broken arbitrarily). Make them the left and right
subtree of a new tree and record the sum of their weights in the root of the new tree as
its weight.

A tree constructed by the above algorithm is called a Huffman tree. It defines— in the manner
described above—a Huffman code.
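A minimal Python sketch of the two steps above, using a min-heap to find the two trees of smallest weight. The frequencies shown are those implied by the example that follows; ties may be broken differently than in the figure, so individual codewords can differ while the codeword lengths stay optimal:

import heapq

def huffman_codes(freqs):
    # freqs: {symbol: frequency}; a tree is a symbol (leaf) or a (left, right) pair
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)                         # Step 1: n one-node trees
    count = len(heap)
    while len(heap) > 1:                        # Step 2: repeat until one tree
        w1, _, t1 = heapq.heappop(heap)         # two trees with smallest weights
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1                              # counter only breaks ties in the heap
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):
            walk(tree[0], code + "0")           # left edges are labeled 0
            walk(tree[1], code + "1")           # right edges are labeled 1
        else:
            codes[tree] = code or "0"
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"A": 0.35, "B": 0.1, "C": 0.2, "D": 0.2, "_": 0.15}))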

The Huffman tree construction for this input is shown below:



Hence, DAD is encoded as 011101, and 10011011011101 is
decoded as BAD_AD.

• With the occurrence frequencies given and the codeword lengths obtained, the average number of bits per symbol in this code is
2 * 0.35 + 3 * 0.1 + 2 * 0.2 + 2 * 0.2 + 3 * 0.15 = 2.25.

• Had we used a fixed-length encoding for the same alphabet, we would have to use at least 3 bits per symbol. Thus, for this toy example, Huffman’s code achieves a compression ratio of 25%.
• In other words, Huffman’s encoding of the text will use 25% less memory than its fixed-length encoding.

Example 2:

1. Compare dynamic programming and greedy techniques 04M


2. Differentiate between Prim’s and Kruskal’s algorithm 04M
