Greedy Algorithms

Suppose we have a set of n files that we want to store on magnetic tape.¹ In the future, users will want to read those files from the tape. Reading a file from tape isn't like reading a file from disk; first we have to fast-forward past all the other files, and that takes a significant amount of time. Let L[1 .. n] be an array listing the lengths of each file; specifically, file i has length L[i]. If the files are stored in order from 1 to n, then the cost of accessing the kth file is

$$\mathrm{cost}(k) = \sum_{i=1}^{k} L[i].$$

The cost reflects the fact that before we read file k we must first scan past all the earlier files on the tape. If we assume for the moment that each file is equally likely to be accessed, then the expected cost of searching for a random file is

$$E[\mathrm{cost}] = \sum_{k=1}^{n} \frac{\mathrm{cost}(k)}{n} = \frac{1}{n} \sum_{k=1}^{n} \sum_{i=1}^{k} L[i].$$

If we change the order of the files on the tape, we change the cost of accessing the files; some files become more expensive to read, but others become cheaper.

¹ Readers who are tempted to object that magnetic tape has been obsolete for decades are cordially invited to tour your nearest supercomputer facility; ask to see the tape robots. Alternatively, consider filing a sequence of books on a library bookshelf. You know, those strange brick-like objects made of dead trees and ink?
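As a quick illustration (a minimal Python sketch; the file lengths below are made-up example data, not from the text), the expected cost of a given on-tape ordering can be computed with a running prefix sum:

```python
def expected_cost(lengths):
    """Expected cost of accessing a uniformly random file,
    given the file lengths in their on-tape order."""
    n = len(lengths)
    total = 0
    prefix = 0
    for length in lengths:
        prefix += length      # cost(k) = L[1] + ... + L[k]
        total += prefix       # accumulate sum of cost(k) over all k
    return total / n

L = [3, 7, 2]                 # hypothetical file lengths, in tape order
# Sorting by increasing length never increases the expected cost.
assert expected_cost(sorted(L)) <= expected_cost(L)
```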
Different file orders are likely to result in different expected costs. Specifically, let π(i) denote the index of the file stored at position i on the tape. Then the expected cost of the permutation π is

$$E[\mathrm{cost}(\pi)] = \frac{1}{n} \sum_{k=1}^{n} \sum_{i=1}^{k} L[\pi(i)].$$

Which order should we use if we want this expected cost to be
as small as possible? The answer seems intuitively clear: Sort the files by increasing length. But intuition is a tricky beast. The only way to be sure that this order works is to actually prove that it works!

Lemma 4.1. E[cost(π)] is minimized when L[π(i)] ≤ L[π(i+1)] for all i.

Proof: Suppose L[π(i)] > L[π(i+1)] for some index i. To simplify notation, let a = π(i) and b = π(i+1). If we swap files a and b, then the cost of accessing a increases by L[b], and the cost of accessing b decreases by L[a]. Overall, the swap changes the expected cost by (L[b] − L[a])/n. But this change is an improvement, because L[b] < L[a]. Thus, if the files are out of order, we can decrease the expected cost by swapping some mis-ordered pair of files. □

This is our first example of
a correct greedy algorithm. To minimize the total expected cost of accessing the files, we put the file
that is cheapest to access first, and then recursively write everything else; no backtracking, no
dynamic programming, just make the best local choice and blindly plow ahead. If we use an efficient
sorting algorithm, the running time is clearly O(n log n), plus the time required to actually write the
files. To show that the greedy algorithm is actually correct, we proved that the output of any other algorithm can be improved by some sort of exchange.

Let's generalize this idea further. Suppose we are also given an array F[1 .. n] of access frequencies for each file; file i will be accessed exactly F[i] times over the lifetime of the tape. Now the total cost of accessing all the files on the tape is

$$\Sigma\mathrm{cost}(\pi) = \sum_{k=1}^{n} \left( F[\pi(k)] \cdot \sum_{i=1}^{k} L[\pi(i)] \right) = \sum_{k=1}^{n} \sum_{i=1}^{k} F[\pi(k)] \cdot L[\pi(i)].$$

As
before, reordering the files can change this total cost. So what order should we use if we want the
total cost to be as small as possible? (This question is similar in spirit to the optimal binary search
tree problem, but the target data structure and the cost function are both different, so the algorithm
must be different, too.) We already proved that if all the frequencies are equal, we should sort the
files by increasing size. If the frequencies are all different but the file lengths L[i] are all equal, then
intuitively, we should sort the files by decreasing access frequency, with the most-accessed file first.
In fact, this is not hard to prove (hint, hint) by modifying the proof of Lemma 4.1. But what if the
sizes and the frequencies both vary? In this case, we should sort the files by the ratio L/F.

Lemma 4.2. Σcost(π) is minimized when L[π(i)]/F[π(i)] ≤ L[π(i+1)]/F[π(i+1)] for all i.

Proof: Suppose L[π(i)]/F[π(i)] > L[π(i+1)]/F[π(i+1)] for some index i. To simplify notation, let a = π(i) and b = π(i+1). If we swap files a and b, then the cost of each of the F[a] accesses to a increases by L[b], and the cost of each of the F[b] accesses to b decreases by L[a]. Overall, the swap changes the total cost by L[b]F[a] − L[a]F[b]. But this change is an improvement, because

$$\frac{L[a]}{F[a]} > \frac{L[b]}{F[b]} \iff L[b]F[a] - L[a]F[b] < 0.$$

Thus, if any two adjacent files are out of order, we can improve the total cost by swapping them. □
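The greedy rule of Lemma 4.2 can be sketched in a few lines of Python (the (length, frequency) pairs below are hypothetical example data): sort the files by increasing ratio L[i]/F[i], compute the total cost with a prefix sum, and check against brute force on a small instance.

```python
from itertools import permutations

def total_cost(files):
    """Total cost of an ordering: sum over positions k of
    F[k-th file] times the total length of files 1..k."""
    total = 0
    prefix = 0
    for length, freq in files:
        prefix += length          # total length scanned to reach this file
        total += freq * prefix    # each of its freq accesses pays the scan
    return total

files = [(5, 1), (2, 4), (3, 3), (8, 2)]   # made-up (length, frequency) pairs

# Greedy order: increasing ratio length/frequency.
greedy = sorted(files, key=lambda f: f[0] / f[1])

# Brute-force check over all 4! orderings of this small instance.
best = min(permutations(files), key=total_cost)
assert total_cost(greedy) == total_cost(best)
```

By Lemma 4.2, the greedy ordering always matches the brute-force optimum; the exhaustive check is only feasible for tiny n, while the greedy rule itself runs in O(n log n) time.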
