Unit 3
Example [Knapsack] :
The solution to the knapsack problem can be viewed as the result of a sequence of
decisions.
An optimal sequence of decisions maximizes the objective function Σpixi and also
satisfies the constraints Σwixi ≤ m and 0 ≤ xi ≤ 1.
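As a concrete illustration, here is a minimal dynamic programming sketch for the
0/1 variant of the problem (each xi restricted to 0 or 1); the profits, weights, and
capacity in the demo are made-up values.

def knapsack(profits, weights, m):
    # f[x] = maximum profit achievable with capacity x using the items seen so far
    f = [0] * (m + 1)
    for p, w in zip(profits, weights):
        # scan capacities downwards so each item is used at most once
        for x in range(m, w - 1, -1):
            f[x] = max(f[x], f[x - w] + p)
    return f[m]

print(knapsack([1, 2, 5], [2, 3, 4], 6))  # -> 6 (x1 = 1, x2 = 0, x3 = 1)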
Example [Shortest path] :
One way to find a shortest path from vertex i to vertex j is to decide which vertex
should be the second vertex, which the third, which the fourth, and so on, until
vertex j is reached.
Principle of optimality :
This principle states that an optimal sequence of decisions has the property that
whatever the initial state and decision are, the remaining decisions must constitute
an optimal decision sequence with regard to the state resulting from the first
decision.
The essential difference between the greedy method and dynamic programming is
that :
- In the greedy method only one decision sequence is ever generated.
- In dynamic programming, many decision sequences may be generated.
The use of these tabulated values makes it natural to recast the recursive equations
into an iterative algorithm.
Multi-Stage Graphs
A multistage graph G = (V, E) is a directed graph in which the vertices are
partitioned into k ≥ 2 disjoint sets Vi, 1 ≤ i ≤ k.
In this graph:
- If <u, v> is an edge in E, then u ∈ Vi and v ∈ Vi+1 for some i, 1 ≤ i < k.
- The sets V1 and Vk are such that │V1│ = │Vk│ = 1.
- Let s and t, respectively, be the vertices in V1 and Vk, then the vertex s is the
source and t the sink.
- Let c(i, j) be the cost of the edge <i, j>.
- The cost of a path from s to t is the sum of the costs of the edges on the path.
Suppose j, 0 ≤ j ≤ n, units of the resource are allocated to project i, and the
resulting net profit is N(i, j).
Now, the problem is to allocate the resource to the r projects such that the total net
profit is maximized.
In addition, G has edges of the type <V(r, j), V(r+1, n)> and each such edge is
assigned a weight of max0 ≤ p ≤ n-j {N(r, p)}.
The ith decision involves determining which vertex in Vi+1, 1 ≤ i ≤ k-2, should be on the path.
Hence, the principle of optimality holds, and we obtain
cost(i, j) = min {c(j, l) + cost(i + 1, l)}, where the minimum is taken over all
l ∈ Vi+1 with <j, l> ∈ E.
Since cost(k-1, j) = c(j, t) if <j, t> ∈ E and cost(k-1, j) = ∞ if <j, t> ∉ E, the above
equation may be solved for cost(1, s) by
- first computing cost(k-2, j) for all j ∈ Vk-2,
- then computing cost(k-3, j) for all j ∈ Vk-3, and so on,
- finally computing cost(1, s), as sketched in the code below.
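A minimal Python sketch of this forward-approach computation, assuming the graph
is given as a dictionary of edge costs with the vertices numbered 1..n in stage order
(s = 1, t = n); the small 3-stage example graph is illustrative, not from the text.

import math

def fgraph(n, c):
    # cost[j]: length of a min-cost path from vertex j to the sink t = n
    cost = [math.inf] * (n + 1)
    d = [0] * (n + 1)            # d[j]: next vertex on a min-cost j-to-t path
    cost[n] = 0
    for j in range(n - 1, 0, -1):
        for (u, l), w in c.items():
            if u == j and w + cost[l] < cost[j]:
                cost[j] = w + cost[l]
                d[j] = l
    path, j = [1], 1             # recover the s-to-t path from the d values
    while j != n:
        j = d[j]
        path.append(j)
    return cost[1], path

edges = {(1, 2): 4, (1, 3): 2, (2, 4): 3, (3, 4): 7}
print(fgraph(4, edges))          # -> (7, [1, 2, 4])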
In order to write an algorithm for a general k-stage graph, an ordering of the n
vertices needs to be decided.
Indices are assigned in order of stages:
- First, s is assigned index 1.
- Then vertices in V2 are assigned indices.
- Then vertices from V3 and so on.
- Finally vertex t has index n.
Using the backward approach, bcost(i, j) = min {bcost(i - 1, l) + c(l, j)}, where the
minimum is taken over all l ∈ Vi-1 with <l, j> ∈ E.
Since bcost(2, j) = c(1, j) if <1, j> ∈ E and bcost(2, j) = ∞ if <1, j> ∉ E, the above
equation can be used to compute bcost(i, j) by
- first computing for i = 3,
- then for i = 4 and so on, as sketched below.
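Symmetrically, a sketch of the backward approach under the same assumed
representation:

import math

def bgraph(n, c):
    # bcost[j]: length of a min-cost path from the source 1 to vertex j
    bcost = [math.inf] * (n + 1)
    bcost[1] = 0
    for j in range(2, n + 1):
        for (l, v), w in c.items():
            if v == j and bcost[l] + w < bcost[j]:
                bcost[j] = bcost[l] + w
    return bcost[n]

edges = {(1, 2): 4, (1, 3): 2, (2, 4): 3, (3, 4): 7}
print(bgraph(4, edges))          # -> 7, matching the forward approach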
Matrix Chain Multiplication
It may appear that the amount of work done won't change if you change the
parenthesization of the expression, but it can be proved that this is not the case!
Let us consider the following example:
- Let A be a 2x10 matrix
- Let B be a 10x50 matrix
- Let C be a 50x20 matrix
Then (AB)C takes 2x10x50 + 2x50x20 = 3000 multiplications, whereas A(BC) takes
10x50x20 + 2x10x20 = 10400 multiplications.
So our problem is :
Given a chain of matrices to multiply, determine the fewest number of
multiplications necessary to compute the product.
The key to solving this problem is noticing the sub-problem optimality condition:
– If a particular parenthesization of the whole product is optimal, then any sub-
parenthesization in that product is optimal as well.
This means :
– If (A (B ((CD) (EF)) ) ) is optimal
– Then (B ((CD) (EF)) ) is optimal as well
Assume that we are calculating ABCDEF and that the following parenthesization is
optimal:
• (A(B((CD) (EF))))
Then it is necessarily the case that
• (B((CD) (EF))) is the optimal parenthesization of BCDEF.
Why is this?
– Because if it wasn't, and say ( ((BC) (DE)) F) was better, then it would also follow
that (A ( ((BC) (DE)) F) ) is better than (A (B ((CD) (EF)) ) ),
– contradicting its optimality!
Our final multiplication will always be of the form
(A0 · A1 · ... Ak) · (Ak+1 · Ak+2 · ... · An-1)
In essence, there is exactly one value of k for which we should "split" our work into
two separate cases so that we get an optimal result.
Here is a list of the cases to choose from:
(A0) · (A1 · A2 · ... · An-1)
(A0 · A1) · (A2 · A3 · ... · An-1)
(A0 · A1 · A2) · (A3 · A4 · ... · An-1)
·
·
·
(A0 · A1 · ... · An-3) · (An-2 · An-1)
(A0 · A1 · ... · An-2) · (An-1)
Basically, count the number of multiplications in each of these choices and pick the
minimum.
Writing Ni,j for the fewest multiplications needed to compute Ai · ... · Aj, where Ai
is a di x di+1 matrix, this gives the recursive formula
Ni,j = mini ≤ k < j {Ni,k + Nk+1,j + di dk+1 dj+1}, with Ni,i = 0.
Now let's turn this recursive formula into a dynamic programming solution.
Which sub-problems are necessary to solve first?
Clearly it's necessary to solve the smaller problems before the larger ones.
• In particular, there is a need to know Ni,i+1, the number of multiplications to
multiply any adjacent pair of matrices, before we move onto larger tasks.
• Similarly, the next task to be solved is finding all the values of the form Ni,i+2,
then Ni,i+3, etc.
Algorithm:
Initialize N[i][i] = 0, and all other entries in N to ∞.
for i := 1 to n-1 do
    for j := 0 to n-1-i do
        for k := j to j+i-1 do
            if (N[j][j+i-1] > N[j][k] + N[k+1][j+i-1] + dj dk+1 dj+i) then
                N[j][j+i-1] := N[j][k] + N[k+1][j+i-1] + dj dk+1 dj+i;
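The same tabulation as runnable Python (a sketch; dims has length n + 1 and
matrix Ai is dims[i] x dims[i+1]):

import math

def matrix_chain(dims):
    n = len(dims) - 1                        # number of matrices A0 .. A(n-1)
    N = [[0] * n for _ in range(n)]          # N[i][j]: min multiplications for Ai..Aj
    for length in range(2, n + 1):           # chain length being solved
        for i in range(n - length + 1):
            j = i + length - 1
            N[i][j] = math.inf
            for k in range(i, j):            # split point: (Ai..Ak)(Ak+1..Aj)
                cost = N[i][k] + N[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                if cost < N[i][j]:
                    N[i][j] = cost
    return N[0][n - 1]

# the A (2x10), B (10x50), C (50x20) example from the text
print(matrix_chain([2, 10, 50, 20]))         # -> 3000, i.e. (AB)C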
Optimal Binary Search Trees
The tree of Figure (a), in the worst case, requires four comparisons to find an
identifier.
The tree of Figure (b) requires only three comparisons in the worst case.
In the case of tree (a), it takes 1, 2, 2, 3, and 4 comparisons, respectively, to find
the identifiers for, do, while, int, and if.
In the case of tree (b), it takes 1, 2, 2, 3, and 3 comparisons, respectively, to find
the identifiers for, do, while, int, and if.
On the average the two trees need 12/5 and 11/5 comparisons, respectively.
In a general situation, different identifiers are to be searched for with different
frequencies (or probabilities).
In addition, there will be unsuccessful searches also.
Let us assume that the given set of identifiers is {a1, a2, ..., an} such that
a1 < a2 < … < an.
Let p(i) be the probability with which identifier ai is searched for.
Let q(i) be the probability that an identifier x being searched for satisfies
ai < x < ai+1, 0 ≤ i ≤ n. Let us assume that a0 = -∞ and an+1 = +∞.
So,
Σ1 ≤ i ≤ n p(i) is the probability of a successful search.
Σ0 ≤ i ≤ n q(i) is the probability of an unsuccessful search.
Clearly,
Σ1 ≤ i ≤ n p(i) + Σ0 ≤ i ≤ n q(i) = 1.
Based on the above data, an optimal binary search tree has to be constructed for
{a1, a2, ..., an}.
To obtain a cost function for binary search trees, it is required to add external nodes,
as shown below.
For a binary search tree representing n identifiers, there will be exactly n internal
nodes and n + 1 external nodes.
Algorithm Search(x)
{
found := false;
t := tree;
while ((t ≠ 0) and not found) do
{
if (x = (t → data)) then found := true;
else if (x < (t → data)) then t := (t → lchild);
else t := (t → rchild);
}
if (not found) then return 0;
else return t;
}
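For comparison, here is a Python rendering of the same search; the small Node
class is a hypothetical minimal representation, not something defined in the text.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    data: str
    lchild: "Optional[Node]" = None
    rchild: "Optional[Node]" = None

def search(tree, x):
    # iterative binary search tree lookup, mirroring Algorithm Search
    t = tree
    while t is not None:
        if x == t.data:
            return t                 # successful search ends at an internal node
        t = t.lchild if x < t.data else t.rchild
    return None                      # unsuccessful search ends at an external node

root = Node("if", Node("do"), Node("while", Node("int")))
print(search(root, "int") is not None)   # -> True
print(search(root, "for") is not None)   # -> False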
Hence, the expected cost contribution from the internal node for ai is
p(i) * level(ai).
Unsuccessful searches terminate with t = 0 at external nodes in the above
algorithm.
The identifiers not in the binary search tree can be partitioned into n + 1
equivalence classes Ei, 0 ≤ i ≤ n.
The class E0 contains all identifiers x such that x < a1.
The class Ei contains all identifiers x such that ai < x < ai + 1, 1 ≤ i < n.
The class En contains all identifiers x, x > an.
For all identifiers in the same class Ei, the search terminates at the same external
node.
If the failure node for Ei is at level l, then only l - 1 iterations of the while loop are
made.
Hence, the cost contribution of this node is q(i) * (level(Ei) - 1).
Now, the expected cost of a binary search tree is given by :
Σ1 ≤ i ≤ n p(i) * level(ai) + Σ0 ≤ i ≤ n q(i) * (level(Ei) - 1) --- (1)
Our aim is to construct an optimal binary search tree for the identifier set
{a1, a2, ..., an} for which (1) is minimum.
Example :
The possible binary search trees for the identifier set {a1, a2, a3} = {do, if, while},
with equal probabilities p(i) = q(i) = 1/7 for all i, have the following costs :
cost(tree a) = 6/7 + 9/7 = 15/7
cost(tree b) = 5/7 + 8/7 = 13/7
cost(tree c) = 6/7 + 9/7 = 15/7
cost(tree d) = 6/7 + 9/7 = 15/7
cost(tree e) = 6/7 + 9/7 = 15/7
The tree b is optimal.
Let us now apply the dynamic programming technique to this problem, to obtain an
optimal binary search tree as the result of a sequence of decisions.
The principle of optimality should hold when applied to the problem state resulting
from a decision.
First, a decision has to be made about which of the ai's should be assigned as the
node of the tree. Let it be ak.
Now, the left subtree l of the root node contains :
- a1, a2, …, ak – 1 as internal nodes and
- External nodes from the classes E0, E1, …, Ek – 1.
The right subtree r of the root node contains :
- ak + 1, ak + 2, …, an as internal nodes and
- External nodes from the classes Ek, Ek + 1, …, En.
Let us define
cost(l) = Σ1 ≤ i < k p(i) * level(ai) + Σ0 ≤ i < k q(i) * (level(Ei) - 1)
and
cost(r) = Σk < i ≤ n p(i) * level(ai) + Σk ≤ i ≤ n q(i) * (level(Ei) - 1).
In both cases, the level is measured with regard to the root of the respective
subtree, at level 1.
Let us use w(i, j) to represent the sum
w(i, j) = q(i) + Σi < l ≤ j (q(l) + p(l)).
Hence, for the whole tree to be optimal, cost(l) and cost(r) must be minimum over
all binary search trees on their respective identifier sets.
Let us use c(i, j) to represent the cost of an optimal binary search tree tij containing
ai+1, ..., aj and Ei, ..., Ej. Choosing the root ak so as to minimize the total cost
gives
c(0, n) = min1 ≤ k ≤ n {c(0, k-1) + c(k, n) + p(k) + w(0, k-1) + w(k, n)} --- (3)
The equation (3) can be generalized for c(i, j) as
c(i, j) = mini < k ≤ j {c(i, k-1) + c(k, j)} + w(i, j), with c(i, i) = 0 --- (4)
During the above computation, the root r(i, j) is recorded for each tree t i, j.
After that an optimal binary search tree is constructed from these r(i, j).
This r(i, j) is the value of k that minimizes (4).
Example :
Let n = 4 and (a1, a2, a3, a4) = (do, if, int, while).
Let p(1 : 4) = (3, 3, 1, 1) and q(0 : 4) = (2, 3, 1, 1, 1).
The p's and q's have been multiplied by 16 for convenience.
Solution :
Initially, w(i, i) = q(i), c(i, i) = 0 and r(i, i) = 0, 0 ≤ i ≤ 4.
Using Equation (4) and the observation w(i, j) = p(j) + q(j) + w(i, j - 1), the
remaining w(i, j), c(i, j), and r(i, j) are obtained, as computed by the sketch below.
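A Python sketch of this tabulation, using equation (4) and the w recurrence; on the
example data it yields c(0, 4) = 32 and r(0, 4) = 2, i.e. the root of the optimal tree
is a2 = if.

import math

def obst(p, q):
    # p[1..n]: success probabilities; q[0..n]: failure probabilities
    n = len(p) - 1                          # p[0] is an unused placeholder
    w = [[0] * (n + 1) for _ in range(n + 1)]
    c = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]                      # w(i, i) = q(i); c(i, i) = r(i, i) = 0
    for span in range(1, n + 1):            # solve all trees with j - i = span
        for i in range(n - span + 1):
            j = i + span
            w[i][j] = p[j] + q[j] + w[i][j - 1]
            best, bestk = math.inf, 0
            for k in range(i + 1, j + 1):   # equation (4): try each root ak
                t = c[i][k - 1] + c[k][j]
                if t < best:
                    best, bestk = t, k
            c[i][j] = best + w[i][j]
            r[i][j] = bestk
    return c[0][n], r[0][n]

p = [None, 3, 3, 1, 1]                      # the text's example, scaled by 16
q = [2, 3, 1, 1, 1]
print(obst(p, q))                           # -> (32, 2)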
All-Pairs Shortest Paths
The all-pairs shortest-path problem is to determine a matrix A such that A(i, j) is the
length of a shortest path from i to j.
The matrix A can be obtained by solving n single-source problems using the
algorithm ShortestPaths.
Since each application of this procedure requires O(n^2) time, the matrix A can be
obtained in O(n^3) time.
In this solution, instead of requiring cost(i,j) ≥ 0, for every edge (i,j), it requires that
G have no cycles with negative length.
If G is allowed to contain a cycle of negative length, then the shortest path between
any two vertices on this cycle has length -∞.
Let us examine a shortest i to j path in G, i ≠ j.
This path originates at vertex i and goes through some intermediate vertices and
terminates at vertex j.
Let us assume that this path contains no cycles, if there is a cycle, then this can be
deleted without increasing the path length (no cycle has negative length).
If k is an intermediate vertex on this shortest path, then the subpaths from i to k
and from k to j must be shortest paths from i to k and k to j, respectively.
If k is the intermediate vertex with highest index, then the i to k path is a shortest i
to k path in G going through no vertex with index greater than k - 1.
Similarly the k to j path is a shortest k to j path in G going through no vertex of
index greater than k - 1.
Let Ak(i, j) denote the length of a shortest path from i to j going through no vertex
of index greater than k. We can obtain a recurrence for Ak(i, j) using an argument
similar to that used before.
A shortest path from i to j going through no vertex higher than k either goes through
vertex k or it does not.
If it does, Ak(i, j) = Ak-1(i, k) + Ak-1(k, j). If it does not, then no intermediate vertex
has index greater than k-1, so Ak(i, j) = Ak-1(i, j). Combining the two cases,
Ak(i, j) = min {Ak-1(i, j), Ak-1(i, k) + Ak-1(k, j)}, k ≥ 1, with A0(i, j) = cost(i, j) --- (2)
The above equation is not true for graphs with cycles of negative length. The
following example illustrates this.
Example :
The following figure shows a digraph together with its matrix A0.
Recurrence (2) can be solved for An by first computing A1, then A2, then A3, and so
on.
Since there is no vertex in G with index greater than n, A(i,j) = A n(i,j).
Example :
In the following figure, the graph (a) has the cost matrix of (b). The initial
A matrix, A0, and its values after three iterations A1, A2, and A3 are as
shown below
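Since the figure and the matrices A0 through A3 are not reproduced here, the
following runnable sketch carries out the same iteration; the 3-vertex cost matrix is
an assumed example.

import math

def all_pairs(cost):
    # cost[i][j]: edge length, math.inf if there is no edge; cost[i][i] = 0
    n = len(cost)
    A = [row[:] for row in cost]             # A starts out as A0 = cost
    for k in range(n):                       # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                # Ak(i, j) = min(Ak-1(i, j), Ak-1(i, k) + Ak-1(k, j))
                if A[i][k] + A[k][j] < A[i][j]:
                    A[i][j] = A[i][k] + A[k][j]
    return A

inf = math.inf
cost = [[0, 4, 11],                          # assumed example, not the text's figure
        [6, 0, 2],
        [3, inf, 0]]
for row in all_pairs(cost):
    print(row)                               # -> [0, 4, 6], [5, 0, 2], [3, 7, 0]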
The Traveling Salesperson Problem
This problem is a permutation problem.
The permutation problems usually are much harder to solve than subset problems
because:
- The permutation problems have n! different permutations of n objects.
- The subset problems have only 2^n different subsets of n objects (n! > 2^n).
Application 1 :
A postal van has to be routed to pick up mail from mail boxes located at n
different sites.
For this situation, an n + 1 vertex graph can be used.
In this, one vertex represents the post office from which the postal van
starts and returns. The remaining n vertices represent n mail boxes.
Edge <i, j> is assigned a cost equal to the distance from site i to site j.
The route taken by the postal van is a tour, and it should be of minimum
length.
Application 2 :
A robot arm has to tighten the nuts on some piece of machinery on an
assembly line.
The arm will start from its initial position, successively move to each of the
nuts, and return to the initial position.
The path of the arm is clearly a tour on a graph in which vertices
represent the nuts.
A minimum-cost tour will minimize the time needed for the arm to
complete its task.
Application 3 :
Let us consider a production environment in which several commodities are
manufactured on the same set of machines. The manufacture proceeds in cycles.
In each production cycle, n different commodities are produced.
When the machines are changed from production of commodity i to commodity j, a
changeover cost cij is incurred.
It is desired to find a sequence in which to manufacture these commodities with the
least changeover cost in a cycle.
So, this problem can be viewed as a traveling salesperson problem on an n vertex
graph with the edge costs cij being the changeover costs.
Let us assume that a tour is a simple path that starts and ends at vertex 1.
Every tour consists of an edge <1, k> for some k ∈ V - {1} and a path from vertex k
to vertex 1.
So, the path from vertex k to vertex 1 goes through each vertex in V - {1, k} exactly
once.
For the optimal tour, the path from k to 1 must be a shortest k to 1 path going
through all vertices in V - {1,k}. So, the principle of optimality holds.
Let g(i,S) be the length of a shortest path starting at vertex i, going through all
vertices in S, and terminating at vertex 1.
So, the function g(1, V - {1}) is the length of an optimal salesperson tour.
From the principle of optimality it follows that
g(1, V - {1}) = min2 ≤ k ≤ n {c1, k + g(k, V - {1, k})} --- (1)
Generalizing, for i ∉ S,
g(i, S) = minj ∈ S {ci, j + g(j, S - {j})} --- (2)
Equation (1) can be solved for g(1, V - {1}), if g(k, V - {1, k}) is known for all
values of k.
The g values can be obtained from equation (2).
So, g(i, ∅) = ci, 1, 1 < i ≤ n.
Thus,
g(2, ∅) = c2, 1 = 5,
g(3, ∅) = c3, 1 = 6, and
g(4, ∅) = c4, 1 = 8.
Using equation (2), compute g(i, S) for |S| = 1, i ≠ 1, 1 ∉ S, and i ∉ S.
g(2, {3}) = c2, 3 + g(3, ∅) = 9 + 6 = 15
g(2, {4}) = c2, 4 + g(4, ∅) = 10 + 8 = 18
g(3, {2}) = c3, 2 + g(2, ∅) = 13 + 5 = 18
g(3, {4}) = c3, 4 + g(4, ∅) = 12 + 8 = 20
g(4, {2}) = c4, 2 + g(2, ∅) = 8 + 5 = 13
g(4, {3}) = c4, 3 + g(3, ∅) = 9 + 6 = 15
Next, compute g(i, S) for |S| = 2, i ≠ 1, 1 ∉ S, and i ∉ S.
g(2, {3, 4}) = min {c2, 3 + g(3, {4}), c2, 4 + g(4, {3})}
= min {9 + 20, 10 + 15} = min {29, 25} = 25
g(3, {2, 4}) = min {c3, 2 + g(2, {4}), c3, 4 + g(4, {2})}
= min {13 + 18, 12 + 13} = min {31, 25} = 25
g(4, {2, 3}) = min {c4, 2 + g(2, {3}), c4, 3 + g(3, {2})}
= min {8 + 15, 9 + 18} = min {23, 27} = 23
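The same computation expressed compactly in Python; in the demo matrix, rows 2
through 4 match the edge costs used above, while the first row (the costs c1, j) is
assumed for illustration, since those values are not listed in the text.

from itertools import combinations

def tsp(c):
    # Held-Karp: g[(i, S)] = length of a shortest path from vertex i through all
    # vertices of S (a frozenset of indices) back to vertex 0
    n = len(c)
    g = {}
    for i in range(1, n):
        g[(i, frozenset())] = c[i][0]            # g(i, empty) = c(i, 1)
    for size in range(1, n - 1):                 # equation (2), by increasing |S|
        for S in combinations(range(1, n), size):
            S = frozenset(S)
            for i in range(1, n):
                if i in S:
                    continue
                g[(i, S)] = min(c[i][j] + g[(j, S - {j})] for j in S)
    full = frozenset(range(1, n))                # equation (1)
    return min(c[0][k] + g[(k, full - {k})] for k in full)

c = [[0, 10, 15, 20],                            # first row assumed for illustration
     [5,  0,  9, 10],
     [6, 13,  0, 12],
     [8,  8,  9,  0]]
print(tsp(c))                                    # -> 35 with this matrix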
Let N be the number of g(i, S)'s that have to be computed before equation (1) can
be used to compute g(1, V - {1}).
For each value of |S| there are n - 1 choices for i.
The number of distinct sets S of size k not including 1 and i is C(n-2, k).
So,
N = Σ0 ≤ k ≤ n-2 (n - 1) C(n-2, k) = (n - 1) 2^(n-2)
Reliability Design
Let device Di have reliability ri (that is, ri is the probability that device Di will
function properly).
Then, the reliability of the entire system of devices connected in series is Π ri.
Even though the individual devices are very reliable (i.e., the ri's are very close to
one), the reliability of the entire system may not be very good.
For example, if n = 10 and ri = .99, 1 ≤ i ≤ 10, then Π ri = .99^10 ≈ .904.
So, it is desirable to have multiple copies of the same device type connected in
parallel through the use of switching circuits as shown below.
The switching circuits determine which devices in any given group are functioning
properly and one such device at each stage is used.
Since the switching circuits themselves are not fully reliable, in practice the stage
reliability is a little less than 1 - (1 - ri)^mi, where mi ≥ 1 is an integer.
Each ci > 0 and each mi should be in the range 1 ≤ mi ≤ ui, where the upper bound
ui is given by
ui = ⌊(c + ci - Σ1 ≤ j ≤ n cj) / ci⌋
with c being the overall cost constraint.
There is at most one tuple for each different x that results from a sequence of
decisions on m1, m2, …, mn.
The dominance rule for this problem is: (f1, x1) dominates (f2, x2) iff f1 ≥ f2 and
x1 ≤ x2.
Hence, the dominated tuples can be discarded from Si.
Example :
Design a three stage system with device types D1, D2 and D3 with the costs 30, 15,
20 and the reliabilities .9, .8, .5 respectively.
The cost of the entire system can not be more than 105.
Solution :
The stage i is having mi devices of type Di in parallel
Φi(mi) = 1 - (1 - ri)^mi.
c1 = 30, c2 = 15, c3 = 20, c = 105
r1 = .9, r2 = .8, r3 = .5
u1 = 2, u2 = 3, u3 = 3
Si is used to denote the set of undominated tuples (f, x) resulting from the decision
sequences for m1, m2, …, mi.
Here, f is the value of fi(x), the best system reliability attainable using stages 1
through i at total cost x.
Begin with S0 = {(1, 0)} and obtain each Si from Si – 1 by trying out all possible values
for mi and combining the resulting tuples together.
Sij is used to represent all tuples obtainable from Si – 1 by choosing mi = j.
Now
S11 = {(.9, 30)} and S12 = {(.99, 60)}
So, S1 = {(.9, 30), (.99, 60)}
S21 = {(.72, 45), (.792, 75)} and S22 = {(.864, 60)}.
The tuple (.9504, 90) is eliminated from S22 because it leaves only a cost of 15, not
sufficient for D3.
S23 = {(.8928, 75)}.
So, S2 = {(.72, 45), (.864, 60), (.8928, 75)}; the tuple (.792, 75) is purged because
it is dominated by (.864, 60).
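A Python sketch of the whole tuple computation, including the upper bounds ui, the
feasibility pruning, and the dominance purge; device failures are assumed
independent, as in the analysis above.

def reliability_design(r, costs, c):
    # r[i], costs[i]: reliability and cost of device Di; c: total cost budget
    n = len(r)
    remaining = sum(costs)
    S = [(1.0, 0)]                               # S0 = {(1, 0)}
    for i in range(n):
        remaining -= costs[i]                    # cost of one copy of each later device
        u = (c + costs[i] - sum(costs)) // costs[i]   # upper bound ui on mi
        candidates = []
        for f, x in S:
            for m in range(1, u + 1):
                phi = 1 - (1 - r[i]) ** m        # stage reliability with m copies
                cost = x + m * costs[i]
                if cost + remaining <= c:        # leave room for D(i+1) .. Dn
                    candidates.append((f * phi, cost))
        # purge dominated tuples: keep strictly increasing f as x increases
        candidates.sort(key=lambda t: (t[1], -t[0]))
        S = []
        for f, x in candidates:
            if not S or f > S[-1][0]:
                S.append((f, x))
    return max(S)                                # best (reliability, cost) tuple

print(reliability_design([.9, .8, .5], [30, 15, 20], 105))
# -> roughly (0.648, 90), i.e. m1 = 1, m2 = 2, m3 = 2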