Optimal Binary Search Tree

Dynamic Programming (DP) is a technique for solving problems with overlapping subproblems and optimal substructure, developed by Richard Bellman in the 1950s. It involves breaking down problems into simpler subproblems, solving each just once, and storing the solutions for future reference. Examples include the Fibonacci sequence and the 0/1 knapsack problem, where optimal solutions to larger problems are constructed from optimal solutions to smaller subproblems.

DYNAMIC PROGRAMMING

• Problems like the knapsack problem and shortest
  paths can be solved by the greedy method, in
  which optimal decisions are made one at a time.
• For many problems, it is not possible to make
  stepwise decisions in such a manner that the
  sequence of decisions made is optimal.
DP Idea
• Dynamic Programming is a general algorithm design
  technique for solving problems defined by recurrences
  with overlapping subproblems.
• Invented by American mathematician Richard Bellman in
  the 1950s to solve optimization problems and later
  assimilated by CS.
• “Programming” here means “planning”.
• Main idea:
  – set up a recurrence relating a solution of a larger instance to solutions
    of some smaller instances
  – solve smaller instances once
  – record solutions in a table
  – extract the solution to the initial instance from that table

V. Balasubramanian 2
Contd…
• The cause of inefficiency in divide-and-conquer:
  after division,
  – smaller instances may be unrelated, e.g., mergesort
  – or smaller instances may be related, e.g., Fibonacci,
    in which case divide-and-conquer repeatedly solves
    common instances
Contd…
• Dynamic programming:
  – a bottom-up approach
  – uses an array (table) to save solutions to small
    instances
Fibonacci
• The Fibonacci series algorithm in dynamic programming is as follows:
• Algorithm: nth Fibonacci Term (Iterative)
  – Problem: Determine the nth term in the Fibonacci sequence.
  – Inputs: a nonnegative integer n.
  – Outputs: fib2, the nth term in the Fibonacci sequence.

  int fib2(int n) {
      int f[n + 1];              // table of Fibonacci values
      f[0] = 0;
      if (n > 0) {
          f[1] = 1;
          for (int i = 2; i <= n; i++)
              f[i] = f[i - 1] + f[i - 2];
      }
      return f[n];
  }

V. Balasubramanian 2
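For checking, the table-filling algorithm above translates directly to Python (a sketch; the name fib2 follows the slide):

```python
def fib2(n):
    """Bottom-up Fibonacci: fill a table f[0..n] once, left to right."""
    f = [0] * (n + 1)          # table of solved subinstances
    if n > 0:
        f[1] = 1
        for i in range(2, n + 1):
            f[i] = f[i - 1] + f[i - 2]
    return f[n]
```

Each table entry is computed exactly once from the two entries before it, so the running time is linear in n.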
Binomial coefficient

    C(n, k) = n! / (k! (n − k)!)    for 0 ≤ k ≤ n
• Definition
  The binomial coefficient
      C(n, k) = n! / (k! (n − k)!)               for 0 ≤ k ≤ n
• Recursive definition
      C(n, k) = C(n − 1, k − 1) + C(n − 1, k)    for 0 < k < n
      C(n, k) = 1                                for k = 0 or k = n
The algorithm
• Algorithm 3.1: Binomial Coefficient Using Divide-
  and-Conquer
  – Problem: Compute the binomial coefficient.
  – Inputs: nonnegative integers n and k, where k ≤ n.
  – Outputs: bin, the binomial coefficient C(n, k).

  int bin(int n, int k) {
      if (k == 0 || n == k)
          return 1;
      else
          return bin(n - 1, k - 1) + bin(n - 1, k);
  }
Using dynamic programming

• Use an array B to store the coefficients.
• Steps:
  – Establish a recursive property:
      B[i][j] = B[i − 1][j − 1] + B[i − 1][j]    for 0 < j < i
      B[i][j] = 1                                for j = 0 or j = i
  – Solve an instance of the problem in a bottom-up
    fashion by computing the rows of B in sequence,
    starting with the first row.
Compute sequence of rows

• Let’s compute B[4][2].
The algorithm
• Algorithm 3.2: Binomial Coefficient Using Dynamic
  Programming
  – Problem: Compute the binomial coefficient.
  – Inputs: nonnegative integers n and k, where k ≤ n.
  – Outputs: bin2, the binomial coefficient C(n, k).

  int bin2(int n, int k) {
      int B[n + 1][k + 1];
      for (int i = 0; i <= n; i++)
          for (int j = 0; j <= (i < k ? i : k); j++)
              if (j == 0 || j == i)
                  B[i][j] = 1;
              else
                  B[i][j] = B[i - 1][j - 1] + B[i - 1][j];
      return B[n][k];
  }
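A Python transcription of Algorithm 3.2 (a sketch, useful for checking the row-by-row filling order):

```python
def bin2(n, k):
    """Bottom-up binomial coefficient: fill Pascal's triangle row by row."""
    B = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                B[i][j] = 1          # boundary of Pascal's triangle
            else:
                B[i][j] = B[i - 1][j - 1] + B[i - 1][j]
    return B[n][k]
```

bin2(4, 2) returns 6, matching the B[4][2] computation above, and each of the Θ(nk) entries is computed once.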
DYNAMIC PROGRAMMING
(Contd..)
Example:
• Suppose a shortest path from vertex i to vertex j is
  to be found.
• Let Ai be the set of vertices adjacent from vertex i.
  Which of the vertices of Ai should be the second
  vertex on the path?
• One way to solve the problem is to enumerate all
  decision sequences and pick out the best.
• In dynamic programming the principle of
  optimality is used to reduce the number of decision
  sequences considered.
DYNAMIC PROGRAMMING
(Contd..)
Principle of optimality:
• An optimal sequence of decisions has the property
  that whatever the initial state and decisions are, the
  remaining decisions must constitute an optimal
  decision sequence with regard to the state
  resulting from the first decision.
• In the greedy method only one decision sequence
  is generated.
• In dynamic programming many decision
  sequences may be generated.
Contd…
• An optimal solution to an instance of a
  problem always contains optimal solutions to
  all of its subinstances.
• This ensures that an optimal solution to an instance
  can be obtained by combining optimal
  solutions to subinstances.
• It is necessary to show that the principle
  applies before using dynamic programming to
  obtain the solution.
DYNAMIC PROGRAMMING
(Contd..)
Example:
• [0/1 knapsack problem]: the xi’s in the 0/1 knapsack problem are
  restricted to either 0 or 1.
• Using KNAP(l, j, y) to represent the problem
      maximize    ∑ pixi          (l ≤ i ≤ j)
      subject to  ∑ wixi ≤ y      (l ≤ i ≤ j)  ………………..(1)
                  xi = 0 or 1,    l ≤ i ≤ j

  The 0/1 knapsack problem is KNAP(1, n, M).
DYNAMIC PROGRAMMING
(Contd..)
• Let y1, y2, ……, yn be an optimal sequence of 0/1 values for
  x1, x2, ….., xn respectively.
• If y1 = 0, then y2, y3, ……, yn must constitute an optimal
  sequence for KNAP(2, n, M); if it does not, then
  y1, y2, ……, yn is not an optimal sequence for KNAP(1, n, M).
• If y1 = 1, then y2, ……, yn must be an optimal sequence for
  KNAP(2, n, M − w1).
• If it is not, there is another 0/1 sequence z2, ….., zn such that
  ∑ pizi (2 ≤ i ≤ n) has greater value, and then y1, z2, ….., zn
  yields greater profit for KNAP(1, n, M).
• Thus the principle of optimality holds.
DYNAMIC PROGRAMMING
(Contd..)
• Let gj(y) be the value of an optimal solution to
  KNAP(j+1, n, y).
• Then g0(M) is the value of an optimal solution to
  KNAP(1, n, M).
• From the principle of optimality,
      g0(M) = max{g1(M), g1(M − w1) + p1}
• g0(M) is the maximum profit, which is the value of
  the optimal solution.
DYNAMIC PROGRAMMING
(Contd..)
• The principle of optimality can be equally applied
  to intermediate states and decisions as it has been
  applied to initial states and decisions.
• Let y1, y2, ….. yn be an optimal solution to
  KNAP(1, n, M).
• Then for each j, 1 ≤ j ≤ n, y1, .., yj and yj+1, …., yn must
  be optimal solutions to the problems
      KNAP(1, j, ∑ wiyi)              (sum over 1 ≤ i ≤ j)
  and KNAP(j+1, n, M − ∑ wiyi)        (sum over 1 ≤ i ≤ j)
  respectively.
DYNAMIC PROGRAMMING
(Contd..)
• Then
      gi(y) = max{gi+1(y),                   (the xi+1 = 0 case)
                  gi+1(y − wi+1) + pi+1}     (the xi+1 = 1 case)  …(1)
• Equation (1) is known as the recurrence relation.
• Dynamic programming algorithms solve the relation
  to obtain a solution to the given problem.
• To solve the 0/1 knapsack problem, we use the
  knowledge gn(y) = 0 for all y ≥ 0 (and gn(y) = −∞ for
  y < 0), because gn(y) is the optimal solution (profit) to
  the problem KNAP(n+1, n, y), which is obviously zero
  for any feasible y.
DYNAMIC PROGRAMMING
(Contd..)
• Substituting i = n − 1 and using gn(y) = 0 in
  relation (1), we obtain gn-1(y).
• Then using gn-1(y), we obtain gn-2(y), and so on till
  we get g0(M) (with i = 0), which is the solution of
  the knapsack problem.
• There are two approaches to solving
  recurrence relation (1):
• (1) the forward approach and (2) the backward approach.
DYNAMIC PROGRAMMING
(Contd..)
• In the forward approach, decision xi is made
  in terms of optimal decision sequences for
  xi+1 …….. xn (i.e., we look ahead).
• In the backward approach, decision xi is made in
  terms of optimal decision sequences for
  x1 …… xi-1 (i.e., we look backward).
DYNAMIC PROGRAMMING
(Contd..)
• For the 0/1 knapsack problem,
      gi(y) = max{gi+1(y), gi+1(y − wi+1) + pi+1} ………(1)
  is the forward approach, as gn-1(y) is obtained
  using gn(y).
      fj(y) = max{fj-1(y), fj-1(y − wj) + pj} …………..(2)
  is the backward approach; fj(y) is the value of an
  optimal solution to KNAP(1, j, y). (2) may be solved
  by beginning with
      f0(y) = 0 for all y ≥ 0 and f0(y) = −∞ for y < 0.
Example
• Consider a 0/1 knapsack problem which has 3
  objects, n = 3; their weights are w1 = 2, w2 = 3,
  w3 = 4; their profits are p1 = 1, p2 = 2, p3 = 5;
  and the knapsack capacity is M = 6. Compute g0(6).
Solution
g0(6) = max{g1(6), g1(6 − w1) + p1}
      = max{g1(6), g1(6 − 2) + 1}
      = max{g1(6), g1(4) + 1}

g1(6) = max{g2(6), g2(6 − w2) + p2}
      = max{g2(6), g2(3) + 2}

g2(6) = max{g3(6), g3(6 − w3) + p3}
      = max{0, g3(2) + 5} = max{0, 5} = 5

g2(3) = max{g3(3), g3(3 − w3) + p3}
      = max{0, g3(−1) + 5} = max{0, −∞} = 0

So g1(6) = max{5, 0 + 2} = 5.
Contd…
g1(4) = max{g2(4), g2(4 − w2) + p2}
      = max{g2(4), g2(4 − 3) + 2}
      = max{g2(4), g2(1) + 2}

g2(4) = max{g3(4), g3(4 − 4) + 5} = max{0, 5} = 5

g1(4) = max{5, g2(1) + 2}
Contd…
g2(1) = max{g3(1), g3(1 − w3) + p3}
      = max{0, g3(−3) + 5} = max{0, −∞ + 5} = 0

g1(4) = max{5, 0 + 2} = 5

g0(6) = max{g1(6), g1(4) + 1} = max{5, 5 + 1} = 6
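The hand computation above can be checked with a short Python sketch of the forward recurrence (the weights, profits, and capacity are the example’s values):

```python
W = [2, 3, 4]          # w1..w3
P = [1, 2, 5]          # p1..p3
N = 3

def g(i, y):
    """Forward recurrence: g(i, y) = value of an optimal solution to KNAP(i+1, n, y)."""
    if y < 0:
        return float('-inf')   # infeasible capacity
    if i == N:
        return 0               # no items left: KNAP(n+1, n, y)
    # x_{i+1} = 0 case vs. x_{i+1} = 1 case
    return max(g(i + 1, y), g(i + 1, y - W[i]) + P[i])
```

g(0, 6) returns 6, matching g0(6) = 6 above; the intermediate values g1(6) = 5, g2(6) = 5, and g2(3) = 0 also agree with the hand computation.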
FEATURES OF DYNAMIC
PROGRAMMING SOLUTIONS

• It is often easier to obtain the recurrence relation using
  the backward approach.
• Dynamic programming algorithms often have
  polynomial complexity.
• Optimal solutions to subproblems are retained so
  as to avoid recomputing their values.
OPTIMAL BINARY SEARCH
TREES
• Definition: binary search tree (BST). A binary
  search tree T is a binary tree; either it is empty or
  each node contains an identifier and
  (i) all identifiers in the left subtree of T are less
  than the identifier in the root node of T;
  (ii) all identifiers in the right subtree are greater
  than the identifier in the root node of T;
  (iii) the right and left subtrees are also BSTs.
Optimal Binary Search Trees
Problem: Given n keys a1 < … < an and probabilities p1, …, pn of
searching for them, find a BST with a minimum
average number of comparisons in a successful search.
Since the total number of BSTs with n nodes is given by
C(2n, n)/(n+1), which grows exponentially, brute force is hopeless.

Example: What is an optimal BST for keys A, B, C, and D with
search probabilities 0.1, 0.2, 0.4, and 0.3, respectively?
DP for Optimal BST Problem
Let C[i,j] be the minimum average number of comparisons made in
T[i,j], the optimal BST for keys ai < … < aj, where 1 ≤ i ≤ j ≤ n.
Consider the optimal BST among all BSTs with some ak (i ≤ k ≤ j)
as their root; T[i,j] is the best among them. With ak as the root,
the left subtree is an optimal BST for ai, …, ak-1 and the right
subtree is an optimal BST for ak+1, …, aj, giving

  C[i,j] = min over i ≤ k ≤ j of { pk · 1
           + ∑ (s = i to k−1) ps · (level of as in T[i,k−1] + 1)
           + ∑ (s = k+1 to j) ps · (level of as in T[k+1,j] + 1) }
Example: key         A    B    C    D
         probability 0.1  0.2  0.4  0.3

The tables below are filled diagonal by diagonal: the left one is filled
using the recurrence
  C[i,j] = min over i ≤ k ≤ j of {C[i,k−1] + C[k+1,j]} + ∑ (s = i to j) ps,
  with C[i,i] = pi;
the right one, for the trees’ roots, records the k’s giving the minima.

  Main table C:                       Root table R:
     j:  0    1    2    3    4           j:  1  2  3  4
  i=1    0   .1   .4  1.1  1.7         i=1   1  2  3  3
  i=2         0   .2   .8  1.4         i=2      2  3  3
  i=3              0   .4  1.0         i=3         3  3
  i=4                   0   .3         i=4            4
  i=5                        0

Optimal BST: C at the root, with left child B (which has left child A)
and right child D; its average cost is C[1,4] = 1.7.
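The table filling above can be sketched in Python (a check of the recurrence; the index conventions follow the tables, with C[i][i−1] = 0 for empty key ranges):

```python
def optimal_bst(p):
    """Fill cost table C and root table R for keys 1..n with probabilities p."""
    n = len(p)
    prob = [0.0] + list(p)                       # 1-indexed probabilities
    C = [[0.0] * (n + 2) for _ in range(n + 2)]  # empty ranges stay 0
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = prob[i]
        R[i][i] = i
    for d in range(1, n):                        # fill diagonal by diagonal
        for i in range(1, n - d + 1):
            j = i + d
            best, root = min(
                (C[i][k - 1] + C[k + 1][j], k) for k in range(i, j + 1)
            )
            C[i][j] = best + sum(prob[i:j + 1])
            R[i][j] = root
    return C[1][n], R[1][n]
```

optimal_bst([0.1, 0.2, 0.4, 0.3]) returns cost 1.7 with root index 3 (key C), matching the tables.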
Analysis of DP for Optimal BST
Problem
Time efficiency: Θ(n³), but it can be reduced to Θ(n²) by taking
advantage of the monotonicity of entries in the
root table, i.e., R[i,j] is always in the range
between R[i,j−1] and R[i+1,j].
Space efficiency: Θ(n²)

The method can be extended to include unsuccessful searches.
ALGORITHM TO SEARCH FOR AN
IDENTIFIER IN THE TREE ‘T’

Procedure SEARCH(T, X, i)
// Search T for X; each node has fields LCHILD, IDENT, RCHILD //
// Return address i pointing to the identifier X //
// Initially T points to the tree //
// On exit, IDENT(i) = X or i = 0 //
  i ← T
Algorithm to search for an identifier in the tree
‘T’ (Contd..)

  while i ≠ 0 do
    case
      : X < IDENT(i) : i ← LCHILD(i)
      : X = IDENT(i) : return i
      : X > IDENT(i) : i ← RCHILD(i)
    endcase
  repeat
end SEARCH
Optimal Binary Search Trees -
Example

[Figure: a BST over the identifiers {if, for, while, repeat, loop},
with the five nodes at levels 1, 2, 2, 3, and 4]

If each identifier is searched for with equal probability, the
average number of comparisons for the above tree is
(1 + 2 + 2 + 3 + 4)/5 = 12/5.
OPTIMAL BINARY SEARCH
TREES (Contd..)
• Let us assume that the given set of
  identifiers is {a1, a2, …….., an} with
  a1 < a2 < ……. < an.
• Let Pi be the probability with which we
  search for ai.
• Let Qi be the probability that the identifier x
  being searched for is such that ai < x < ai+1,
  0 ≤ i ≤ n, where a0 = −∞ and an+1 = +∞.
OPTIMAL BINARY SEARCH
TREES (Contd..)
• Then ∑ Qi (0 ≤ i ≤ n) is the probability of an unsuccessful
  search, and ∑ P(i) (1 ≤ i ≤ n) + ∑ Q(i) (0 ≤ i ≤ n) = 1.
  Given this data, let us construct an optimal binary search
  tree for (a1 ………. an).
• In place of empty subtrees, we add external nodes,
  denoted by squares.
• Internal nodes are denoted by circles.
OPTIMAL BINARY SEARCH
TREES (Contd..)

[Figure: the same BST over {if, for, while, repeat, loop},
redrawn with square external nodes in place of empty subtrees]
Construction of optimal binary search
trees
• A BST with n identifiers will have n internal
  nodes and n+1 external nodes.
• Successful searches terminate at internal nodes;
  unsuccessful searches terminate at external nodes.
• If a successful search terminates at an internal
  node at level L, then L iterations of the loop in the
  algorithm are needed.
• Hence the expected cost contribution from the
  internal node for ai is P(i) * level(ai).
OPTIMAL BINARY SEARCH
TREES (Contd..)
• Unsuccessful searches terminate at external nodes,
  i.e., with i = 0.
• The identifiers not in the binary search tree may
  be partitioned into n+1 equivalence classes
  Ei, 0 ≤ i ≤ n:
      E0 contains all x such that x < a1
      Ei contains all x such that ai < x < ai+1, 1 ≤ i < n
      En contains all x such that x > an
OPTIMAL BINARY SEARCH
TREES (Contd..)
• For identifiers in the same class Ei, the
  search terminates at the same external node.
• If the failure node for Ei is at level L, then
  only L − 1 iterations of the while loop are
  made.
∴ The cost contribution of the failure node
  for Ei is Q(i) * (level(Ei) − 1).
OPTIMAL BINARY SEARCH
TREES (Contd..)
• Thus the expected cost of a binary search tree is:
      ∑ P(i) * level(ai)          (1 ≤ i ≤ n)
    + ∑ Q(i) * (level(Ei) − 1)    (0 ≤ i ≤ n)  ……(2)
• An optimal binary search tree for {a1 ……., an} is a
  BST for which (2) is minimum.
• Example: Let {a1, a2, a3} = {do, if, stop}.
OPTIMAL BINARY SEARCH
TREES (Contd..)

[Figure: trees (a)–(c) of the five possible BSTs for
{a1, a2, a3} = {do, if, stop}:
 (a) stop at level 1, if at level 2, do at level 3 (left-skewed),
 (b) if at the root with do and stop as children (balanced),
 (c) a third shape; the external (failure) nodes Q(0)…Q(3)
 are shown as squares]
OPTIMAL BINARY SEARCH
TREES (Contd..)

[Figure: the remaining two shapes, trees (d) and (e)]
OPTIMAL BINARY SEARCH
TREES (Contd..)
• With equal probabilities, P(i) = Q(i) = 1/7.
• Let us find an OBST out of these.
• Cost(tree a) = ∑ P(i)*level(ai)  (1 ≤ i ≤ n)
              + ∑ Q(i)*(level(Ei) − 1)  (0 ≤ i ≤ n)
              = 1/7 [1 + 2 + 3 + (2−1) + (3−1) + (4−1) + (4−1)] = 15/7
• Cost(tree b) = 1/7 [1 + 2 + 2 + 2 + 2 + 2 + 2] = 13/7
• Cost(tree c) = cost(tree d) = cost(tree e) = 15/7
∴ tree b is optimal.
OPTIMAL BINARY SEARCH
TREES (Contd..)
• If P(1) = 0.5, P(2) = 0.1, P(3) = 0.05, Q(0) = 0.15,
  Q(1) = 0.1, Q(2) = 0.05, and Q(3) = 0.05,
  find the OBST.
• Cost(tree a) = 0.5×3 + 0.1×2 + 0.05×1
              + 0.15×3 + 0.1×3 + 0.05×2 + 0.05×1 = 2.65
• Cost(tree b) = 1.9, Cost(tree c) = 1.5, Cost(tree d) = 2.05,
• Cost(tree e) = 1.6. Hence tree c is optimal.
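Rather than enumerating all five trees by hand, the minimum of cost (2) can be computed with the standard recurrence c(i,j) = w(i,j) + min over k of {c(i,k−1) + c(k,j)} (a sketch; the function name is illustrative):

```python
def obst_cost(p, q):
    """Minimum of SUM P(i)*level(ai) + SUM Q(i)*(level(Ei) - 1)
    over all BSTs, via c(i,j) = w(i,j) + min_k { c(i,k-1) + c(k,j) }."""
    n = len(p)
    p = [0.0] + list(p)                          # 1-indexed P(1..n); q is Q(0..n)
    w = [[0.0] * (n + 1) for _ in range(n + 1)]
    c = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]                           # weight of the empty tree T(i,i)
        for j in range(i + 1, n + 1):
            w[i][j] = w[i][j - 1] + p[j] + q[j]
    for d in range(1, n + 1):                    # number of keys in the subtree
        for i in range(n - d + 1):
            j = i + d
            c[i][j] = w[i][j] + min(c[i][k - 1] + c[k][j]
                                    for k in range(i + 1, j + 1))
    return c[0][n]
```

For the probabilities above it returns 1.5, the cost of tree c, and with P(i) = Q(i) = 1/7 it returns 13/7, the cost of tree b, confirming both hand computations.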
OPTIMAL BINARY SEARCH
TREES (Contd..)
• To obtain an OBST using dynamic programming,
  we need to take a sequence of decisions regarding
  the construction of the tree.
• The first decision is which ai is to be the root.
• Let us choose ak as the root. Then the internal
  nodes for a1, ……., ak-1 and the external nodes for
  classes E0, E1, ……, Ek-1 will lie in the left subtree L
  of the root.
• The remaining nodes will be in the right subtree R.
OPTIMAL BINARY SEARCH
TREES (Contd..)
Define
  cost(L) = ∑ P(i)*level(ai)  (1 ≤ i < k) + ∑ Q(i)*(level(Ei) − 1)  (0 ≤ i < k)
  cost(R) = ∑ P(i)*level(ai)  (k < i ≤ n) + ∑ Q(i)*(level(Ei) − 1)  (k ≤ i ≤ n)
• Let Tij be the tree with nodes ai+1, ….., aj and the external
  nodes corresponding to Ei, Ei+1, ….., Ej.
• Let W(i,j) represent the weight of tree Tij.
OPTIMAL BINARY SEARCH
TREES (Contd..)
  W(i,j) = P(i+1) + … + P(j) + Q(i) + Q(i+1) + … + Q(j)
         = Q(i) + ∑ (l = i+1 to j) [Q(l) + P(l)]
• The expected cost of the search tree in (a) (call it T) is
      P(k) + cost(L) + cost(R) + W(0, k−1) + W(k, n).
  W(0, k−1) is the sum of the probabilities corresponding to the
  nodes and equivalence classes to the left of ak;
  W(k, n) is the sum of the probabilities corresponding to those
  on the right of ak.

[Figure (a): an OBST with root ak, left subtree L and right subtree R]
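The identity behind this decomposition, W(0, k−1) + P(k) + W(k, n) = W(0, n), can be checked numerically for the do/if/stop instance (a sketch; W follows the definition above):

```python
P = [None, 0.5, 0.1, 0.05]          # P(1..3), 1-indexed
Q = [0.15, 0.1, 0.05, 0.05]         # Q(0..3)
n = 3

def W(i, j):
    """Weight of T(i,j): Q(i) + sum over l = i+1..j of [Q(l) + P(l)]."""
    return Q[i] + sum(Q[l] + P[l] for l in range(i + 1, j + 1))
```

For every choice of root k, splitting the weight at ak loses only P(k), so W(0, k−1) + P(k) + W(k, n) always equals W(0, n) = 1; this is what lets the DP charge each subtree its own weight at every level.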
