
15-750: Graduate Algorithms February 1, 2016

Lecture 7: Dynamic Programming I: Optimal BSTs

Lecturer: David Witmer Scribes: Ellango Jothimurugesan, Ziqiang Feng

1 Overview
The basic idea of dynamic programming (DP) is to solve a problem by breaking it up into smaller
subproblems and reusing solutions to subproblems.

1.1 Requirements
In general, there are several requirements for applying dynamic programming to a problem [Algorithm
Design, Kleinberg and Tardos]:

1. The solution to the original problem can be computed from solutions to (independent) subproblems.

2. There is a polynomial number of subproblems.

3. There is an ordering of the subproblems such that the solution to a subproblem depends only
on solutions to subproblems that precede it in this order.

1.2 Dynamic Programming Steps


1. Define subproblems. This is the critical step. Usually the recurrence structure follows
naturally after defining the subproblems.

2. Recurrence. Write the solution to a (sub)problem in terms of solutions to smaller subproblems, i.e., as a recursion. This gives the algorithm.

3. Correctness. Prove the recurrence is correct, usually by induction.

4. Complexity. Analyze the runtime complexity. Usually:

runtime = (#subproblems) × (time to solve each one)

But sometimes there are more clever ways.

2 Computing the Number of Binary Search Trees


Suppose we have N keys. Without loss of generality, let’s assume the keys are integers {1, 2, 3, . . . , N }.
We ask the question: how many different binary search trees (BSTs) can we construct for these keys?
Note that we need to maintain the BST property: the key of a node is greater than all keys in its
left subtree and less than all keys in its right subtree.

2.1 Define subproblems
Definition 2.1.
B(n) = the number of BSTs on n nodes

Examples 2.2. B(n) for small n’s.


B(0) = 1 (the empty tree).
B(1) = 1 (a single node).
B(2) = 2

[Figure: the two BSTs on {1, 2}, with root 1 or root 2]

B(3) = 5

[Figure: the five BSTs on {1, 2, 3}]

2.2 Recurrence
Definition 2.3.

f(n) = Σ_{i=0}^{n−1} f(i) · f(n − i − 1)

f(0) = 1, f(1) = 1, f(2) = 2

Note:
1. Since the function is defined in recursive form, it is necessary to give base case values (e.g., f(0)).
2. The base cases given here are also the values of B(0), B(1), B(2).

2.3 Correctness
Claim 2.4.
B(n) = f (n)

Proof. (by induction)

Base case: It is obvious, as B(0) = f(0) = 1, B(1) = f(1) = 1, and B(2) = f(2) = 2.

Inductive case: Assume B(n) = f (n) holds for all n < m. We will show B(m) = f (m).
Let

S_m = {m-node BSTs}
S_m^i = {m-node BSTs with i nodes in the left subtree}

We can divide S_m into disjoint cases according to the number of nodes in the left subtree:

B(m) = |S_m| = Σ_{i=0}^{m−1} |S_m^i|

Note that because we are counting the number of nodes in the left subtree, excluding the root, the
sum only goes up to m − 1. Also note that we must maintain the fundamental property of a BST,
so fixing i fixes the root node and the sets of keys that go to its left and right.
Next let's consider |S_m^i|. An m-node BST with an i-node left subtree has an (m − 1 − i)-node
right subtree.
[Figure: a root r whose i-node left subtree and (m − i − 1)-node right subtree can be constructed independently]

Observation: if we keep the right subtree unchanged and change the left subtree (on the same
i keys), we get a different valid m-node tree, and likewise for the right subtree. For example, if
there are 2 possible left subtrees and 3 possible right subtrees, we get 2 × 3 = 6 different trees.

⟹ |S_m^i| = B(i) · B(m − i − 1)

⟹ B(m) = Σ_{i=0}^{m−1} |S_m^i| = Σ_{i=0}^{m−1} B(i) · B(m − i − 1)

⟹ B(m) = Σ_{i=0}^{m−1} f(i) · f(m − i − 1) = f(m), by the inductive hypothesis.

The algorithm follows immediately from the recurrence structure.

Algorithm 1 Compute B(n)


Input: n ∈ N
Output: B(n)
if n = 0 or n = 1 then
return 1
else if n = 2 then
return 2
else
return Σ_{i=0}^{n−1} B(i) · B(n − i − 1)
end if
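A direct Python transcription of Algorithm 1, with the summation written out explicitly. This is a sketch (the function name is illustrative, not from the notes); it runs in exponential time without memoization, as analyzed in the next section.

```python
def num_bsts_naive(n):
    """Number of BSTs on n keys via B(n) = sum_{i=0}^{n-1} B(i) * B(n-i-1).

    Direct transcription of Algorithm 1; exponential time without memoization.
    """
    if n <= 1:  # base cases: B(0) = B(1) = 1 (empty tree / single node)
        return 1
    # Sum over all possible left-subtree sizes i (the root takes one node)
    return sum(num_bsts_naive(i) * num_bsts_naive(n - i - 1) for i in range(n))
```

For example, `num_bsts_naive(3)` returns 5, matching Examples 2.2.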

2.4 Runtime analysis
Define T(n) to be the runtime of computing B(n).

Naive analysis. In the naive implementation of the algorithm, we can analyze the runtime with
the recurrence tree.

T(n) = Σ_{i=0}^{n−1} T(i) + n
     ≥ T(n − 1) + T(n − 2)
     ≥ F(n)        (F(n) is the Fibonacci sequence)
     = 2^{Θ(n)}

(Better) Reusing solutions to subproblems. Observe in the recurrence tree below that there are
repeated calls to the same subproblem, for example B(n − 2). In the naive implementation, this
solution is recomputed many times. We can improve the runtime by computing it only once and
reusing it for later calls.

[Figure: recurrence tree for B(n); subproblems such as B(n − 2) are computed redundantly many times]

Formally, we note there exist dependencies among the subproblems: B(n) depends on
B(n − 1), . . . , B(1), B(0); B(2) depends on B(1), B(0); and so on. The dependency graph is typically
a DAG (directed acyclic graph).

[Figure: dependency DAG B(0) → B(1) → B(2) → · · · → B(n − 1) → B(n)]

From the DAG, we can specify an order to compute the solutions to sub-problems. In this case,
we compute in the order of B(0), B(1), B(2), · · · , B(n).

Runtime. Let T_i be the cost to compute B(i) given B(0), B(1), . . . , B(i − 1), i.e., the solutions
to all of its dependent subproblems.

T_i = O(i)   (multiplications and additions)

∴ T(n) = Σ_{i=1}^{n} T_i = Σ_{i=1}^{n} O(i) = O(n²)
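Following the dependency order B(0), B(1), . . . , B(n), the O(n²) bottom-up computation can be sketched in Python (names are illustrative, not from the notes):

```python
def num_bsts(n):
    """Compute B(n) bottom-up in dependency order: O(n^2) time, O(n) space."""
    B = [0] * (n + 1)
    B[0] = 1  # B(0) = 1: the empty tree
    for m in range(1, n + 1):
        # B(m) = sum over left-subtree sizes i of B(i) * B(m - 1 - i);
        # all needed values B(0), ..., B(m - 1) are already in the table.
        B[m] = sum(B[i] * B[m - 1 - i] for i in range(m))
    return B[n]
```

Each table entry is computed once, so the work matches the Σ O(i) = O(n²) bound above.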

Remark 2.5. B(n) is also called the Catalan number. It can be approximated by

B(n) ≈ 4^n / (n^{3/2} √π)

See https://en.wikipedia.org/wiki/Catalan_number for more details.
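As a numerical sanity check (not part of the notes), the exact value, via the closed form B(n) = binom(2n, n) / (n + 1), can be compared against this approximation:

```python
import math

def catalan(n):
    """Exact Catalan number C_n = binom(2n, n) / (n + 1)."""
    return math.comb(2 * n, n) // (n + 1)

def catalan_approx(n):
    """Asymptotic approximation 4^n / (n^{3/2} * sqrt(pi))."""
    return 4.0 ** n / (n ** 1.5 * math.sqrt(math.pi))
```

The ratio `catalan(n) / catalan_approx(n)` tends to 1 as n grows; already for n = 50 it is within a few percent.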

3 Optimal Binary Search Trees


Suppose we are given a list of keys k_1 < k_2 < . . . < k_n and a list of probabilities p_i that each key
will be looked up. An optimal binary search tree is a BST T that minimizes the expected search
time

Σ_{i=1}^{n} p_i (depth_T(k_i) + 1),

where the depth of the root is 0. We will assume WLOG that the keys are the numbers 1, 2, . . . , n.

Example 3.1. Input:


ki 1 2 3
pi 0.3 0.1 0.6
For n = 3, there are 5 possible BSTs. The following figure enumerates them and their corre-
sponding expected search times. The optimal BST for the given input is the second tree.

[Figure: the five BSTs on {1, 2, 3}, with expected search times 1.7, 1.5, 1.9, 1.8, and 2.3, respectively]
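These values can be checked by brute force. The sketch below (assuming the key probabilities from the table above; helper names are illustrative) enumerates all BSTs on {1, 2, 3} and computes each tree's expected search time:

```python
def all_bsts(i, j):
    """Yield every BST on keys i..j as a nested tuple (key, left, right); None is empty."""
    if i > j:
        yield None
        return
    for k in range(i, j + 1):  # choose the root
        for left in all_bsts(i, k - 1):
            for right in all_bsts(k + 1, j):
                yield (k, left, right)

def expected_search_time(tree, p, depth=0):
    """Sum of p[key] * (depth + 1) over all nodes; the root has depth 0."""
    if tree is None:
        return 0.0
    k, left, right = tree
    return (p[k] * (depth + 1)
            + expected_search_time(left, p, depth + 1)
            + expected_search_time(right, p, depth + 1))

p = {1: 0.3, 2: 0.1, 3: 0.6}
costs = sorted(round(expected_search_time(t, p), 6) for t in all_bsts(1, 3))
```

Sorting gives costs [1.5, 1.7, 1.8, 1.9, 2.3], agreeing with the figure; the optimum is 1.5.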

In general, there are Θ(4^n n^{−3/2}) binary trees, so we cannot enumerate them all. By using
dynamic programming, however, we can solve the problem efficiently.
1. Define subproblems
As we will commonly do with dynamic programming, let us first focus on computing the
numeric value of the expected search time for an optimal BST, and then we will consider how
to modify our solution in order to find the corresponding BST.
Let 1 ≤ i ≤ j ≤ n, and let T be any BST on i, . . . , j. We will define the cost of T

C(T) = Σ_{l=i}^{j} p_l (depth_T(l) + 1)

and the subproblems

C_{ij} = min_{T on i,...,j} C(T).

The expected search time for the optimal BST is C_{1n}.


2. Recurrence relation
Suppose the root of T on i, . . . , j is k.

[Figure: a BST with root k, left subtree T_L on keys i, . . . , k − 1, and right subtree T_R on keys k + 1, . . . , j]

The cost of T is

C(T) = Σ_{l=i}^{j} p_l (depth_T(l) + 1)
     = Σ_{l=i}^{k−1} p_l (depth_{T_L}(l) + 1 + 1) + p_k + Σ_{l=k+1}^{j} p_l (depth_{T_R}(l) + 1 + 1)
     = C(T_L) + C(T_R) + Σ_{l=i}^{j} p_l.

And so we define the recurrence C′_{ij}:

C′_{ij} = min_{i≤k≤j} {C′_{i,k−1} + C′_{k+1,j}} + Σ_{l=i}^{j} p_l   if i < j
C′_{ij} = p_i                                                       if i = j
C′_{ij} = 0                                                         if i > j

3. Correctness
Claim 3.2. C′_{ij} = C_{ij}.

Proof. The proof is by induction on j − i. The base case is trivial.


Cij ≤ Cij0 : Per the previous calculation, C 0 is the cost of some BST on i, . . . , j and C is the
ij ij
cost of an optimal BST.
C_{ij} ≥ C′_{ij}: Suppose the root of the optimal BST is k. Then,

C_{ij} = C_{i,k−1} + C_{k+1,j} + Σ_{l=i}^{j} p_l
       ≥ C′_{i,k−1} + C′_{k+1,j} + Σ_{l=i}^{j} p_l
       ≥ min_{i≤k≤j} {C′_{i,k−1} + C′_{k+1,j}} + Σ_{l=i}^{j} p_l
       = C′_{ij}.

From the recurrence, we have the following memoized algorithm.

Algorithm 2 Expected Search Time of Optimal BST


function C(i,j)
if C(i, j) already computed then
return C(i, j)
end if
if i > j then
return 0
else if i = j then
return pi
else
return min_{i≤k≤j} {C(i, k − 1) + C(k + 1, j)} + Σ_{l=i}^{j} p_l
end if
end function

In order to compute the actual BST, for each subproblem we can also store the root of the
corresponding subtree
r(i, j) = argmin_{i≤k≤j} {C(i, k − 1) + C(k + 1, j)}.
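A memoized Python version of Algorithm 2 that also records the roots r(i, j) might look like the following sketch (keys 1, . . . , n, probabilities passed as a 0-indexed list; names are illustrative, not from the notes):

```python
from functools import lru_cache

def optimal_bst(p):
    """Return the optimal expected search time and the roots r(i, j).

    p[i] is the probability of key i + 1; keys are 1..n.
    """
    n = len(p)
    prob = {i + 1: p[i] for i in range(n)}
    root = {}  # root[(i, j)] = argmin_k {C(i, k-1) + C(k+1, j)}

    @lru_cache(maxsize=None)
    def C(i, j):
        if i > j:       # empty range
            return 0.0
        if i == j:      # single key
            return prob[i]
        total = sum(prob[l] for l in range(i, j + 1))  # sum_{l=i..j} p_l
        best_k = min(range(i, j + 1),
                     key=lambda k: C(i, k - 1) + C(k + 1, j))
        root[(i, j)] = best_k
        return C(i, best_k - 1) + C(best_k + 1, j) + total

    return C(1, n), root
```

For the input of Example 3.1, `optimal_bst([0.3, 0.1, 0.6])` returns cost 1.5 with r(1, 3) = 3, matching the second tree in the figure.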

4. Runtime There are a total of O(n²) subproblems, and each subproblem takes O(n) time to
compute, assuming all of its subproblems are already solved. Thus, the total running time is
O(n³).
But we can actually do better than this.

Theorem 3.3 (Knuth, 1971).

r(i, j − 1) ≤ r(i, j) ≤ r(i + 1, j).

By the theorem, for each subproblem we can restrict the possible values of k to r(i + 1, j) −
r(i, j − 1) + 1 choices instead of trying all i ≤ k ≤ j. For this case, the total runtime is
Σ_{i=1}^{n} Σ_{j=1}^{n} (r(i + 1, j) − r(i, j − 1) + 1)
  = n² + Σ_{i=1}^{n} Σ_{j=1}^{n} r(i + 1, j) − Σ_{i=1}^{n} Σ_{j=1}^{n} r(i, j − 1)
  = n² + Σ_{i=2}^{n+1} Σ_{j=1}^{n} r(i, j) − Σ_{i=1}^{n} Σ_{j=0}^{n−1} r(i, j)
  ≤ n² + Σ_{i=1}^{n+1} Σ_{j=1}^{n} r(i, j) − Σ_{i=1}^{n} Σ_{j=1}^{n−1} r(i, j)
  = n² + (Σ_{i=1}^{n} Σ_{j=1}^{n} r(i, j) + Σ_{j=1}^{n} r(n + 1, j)) − (Σ_{i=1}^{n} Σ_{j=1}^{n} r(i, j) − Σ_{i=1}^{n} r(i, n))
  = n² + Σ_{j=1}^{n} r(n + 1, j) + Σ_{i=1}^{n} r(i, n)
  = O(n²),

where we used the fact that r(·, ·) ≤ n.
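For illustration, here is a sketch (not from the notes) of the O(n²) dynamic program that uses Theorem 3.3 to restrict the root search to r(i, j − 1) ≤ k ≤ r(i + 1, j):

```python
def optimal_bst_knuth(p):
    """Optimal-BST cost in O(n^2) using Knuth's root-monotonicity bound.

    p[i] is the probability of key i + 1; keys are 1..n. Tables are
    1-indexed with sentinel entries for empty ranges (i > j).
    """
    n = len(p)
    prefix = [0.0] * (n + 1)  # prefix[j] = p_1 + ... + p_j
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] + p[i - 1]

    C = [[0.0] * (n + 2) for _ in range(n + 2)]  # C[i][j] = 0 when i > j
    r = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = p[i - 1]
        r[i][i] = i
    for length in range(2, n + 1):       # interval length j - i + 1
        for i in range(1, n - length + 2):
            j = i + length - 1
            # Knuth: the optimal root lies in [r(i, j-1), r(i+1, j)]
            lo, hi = r[i][j - 1], r[i + 1][j]
            best, best_k = float("inf"), lo
            for k in range(lo, hi + 1):
                cost = C[i][k - 1] + C[k + 1][j]
                if cost < best:
                    best, best_k = cost, k
            C[i][j] = best + prefix[j] - prefix[i - 1]
            r[i][j] = best_k
    return C[1][n], r
```

The telescoping argument above is what makes the inner loop over [lo, hi] sum to O(n²) across all subproblems.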
