Divide and Conquer (Merge Sort)
Divide and Conquer (Merge Sort)
Divide and
and Conquer
Conquer
(Merge
(Merge Sort)
Sort)
-2 Comp 122
Which are
more difficult?
3
-3
Sort
Obviously application
Sort a list of names.
Organize an MP3 library.
Display Google PageRank results.
List RSS news items in reverse chronological order.
Problems become easy once items are in sorted order
Find the median.
Find the closest pair.
Binary search in a database.
Identify statistical outliers.
Find duplicates in a mailing list.
4
-4
Non-obvious applications
Data compression.
Computer graphics.
Computational biology.
Supply chain management.
Book recommendations on Amazon.
Load balancing on a parallel computer.
....
5
-5
Divide-and-Conquer Examples
Sorting: mergesort and quicksort
-6
An Example: Merge Sort
Sorting Problem: Sort a sequence of n elements into
non-decreasing order.
-7 Comp 122
Mergesort
Split array A[1….n] into about equal halves and
make copies of each half in arrays B and C
Sort arrays B and C recursively
Merge sorted arrays B and C into array A as
follows:
Repeat the following until no elements remain in one of the
arrays:
• compare the first elements in the remaining unprocessed
portions of the arrays
• copy the smaller of the two into A, while incrementing the
index indicating the unprocessed portion of that array
Once all elements in one of the arrays are processed, copy the
remaining unprocessed elements from the other array into A.
-8
Merge Sort
John von Neumann 1945.
MERGE-SORT A[1 . . n] Divide
1. If n = 1, done.
2. Recursively sort A[ 1 . .
n/2 ] and A[ n/2+1 . . n ] .
3. “Merge” the 2 sorted lists.
Conque
r Combine
Key subroutine: “Merge”
9
-9
Merge-Sort (A, p, r)
INPUT: a sequence of n numbers stored in array A
OUTPUT: an ordered sequence of n numbers
- 10 Comp 122
Merge Sort – Example
Original Sequence Sorted Sequence
18 26 32 6 43 15 9 1 1 6 9 15 18 26 32 43
18 26 32 6 43 15 9 1 6 18 26 32 1 9 15 43
43
18 26 32 6 43 15 9 1 18 26 6 32 15 43 1 9
18 26 32 6 43 15 9 1 18 26 32 6 43 15 9 1
18 26 32 6 43 15 9 1
- 12 Comp 122
Merge-Sort (A, p, r)
INPUT: a sequence of n numbers stored in array A
OUTPUT: an ordered sequence of n numbers
- 13 Comp 122
Procedure Merge
Merge(A, p, q, r)
1 n1 q – p + 1
2 n2 r – q Input: Array containing
3 for i 1 to n1
4 do L[i] A[p + i – 1]
sorted subarrays A[p..q]
5 for j 1 to n2 and A[q+1..r].
6 do R[j] A[q + j]
7 L[n1+1] Output: Merged sorted
8 R[n2+1] subarray in A[p..r].
9 i1
10 j 1
11 for k p to r
12 do if L[i] R[j]
13 then A[k] L[i]
14 ii+1 Sentinels, to avoid having to
15 else A[k] R[j] check if either subarray is
16 jj+1
fully copied at each step.
- 14 Comp 122
Merge – Example
A … 61 86 26
8 32 1 32
9 26 9 42 43 …
k k k k k k k k k
L 6 8 26 32 R 1 9 42 43
i i i i i j j j j j
- 15 Comp 122
Correctness of Merge
Merge(A, p, q, r)
Loop Invariant for the for loop
1 n1 q – p + 1
2 n2 r – q
At the start of each iteration of the
3 for i 1 to n1 for loop:
4 do L[i] A[p + i – 1] Subarray A[p..k – 1]
5 for j 1 to n2 contains the k – p smallest elements
6 do R[j] A[q + j] of L and R in sorted order.
7 L[n1+1] L[i] and R[j] are the smallest elements of
8 R[n2+1] L and R that have not been copied back into
9 i1
A.
10 j 1
11 for k p to r
12 do if L[i] R[j]
13 then A[k] L[i] Initialization:
14 ii+1 Before the first iteration:
15 else A[k] R[j] •A[p..k – 1] is empty.
16 jj+1 •i = j = 1.
•L[1] and R[1] are the smallest
elements of L and R not copied to A.
- 16 Comp 122
Correctness of Merge
Merge(A, p, q, r) Maintenance:
1 n1 q – p + 1
2 n2 r – q Case 1: L[i] R[j]
3 for i 1 to n1 •By LI, A contains p – k smallest elements
4 do L[i] A[p + i – 1] of L and R in sorted order.
5 for j 1 to n2 •By LI, L[i] and R[j] are the smallest
6 do R[j] A[q + j] elements of L and R not yet copied into A.
7 L[n1+1] •Line 13 results in A containing p – k + 1
8 R[n2+1] smallest elements (again in sorted order).
9 i1
Incrementing i and k reestablishes the LI
10 j 1
11 for k p to r for the next iteration.
12 do if L[i] R[j] Similarly for L[i] > R[j].
13 then A[k] L[i] Termination:
14 ii+1 •On termination, k = r + 1.
15 else A[k] R[j]
•By LI, A contains r – p + 1 smallest
16 jj+1
elements of L and R in sorted order.
•L and R together contain r – p + 3 elements.
All but the two sentinels have been copied
back into A.
- 17 Comp 122
Analysis of Merge Sort
Running time T(n) of Merge Sort:
Merge Sort work correctly with any input i.e even or
odd. But our analysis assume that input size is a power
of 2.
Merge sort on one element takes constant time
whereas when we have n > 1 eleements, break down
the running time as follows.
When an algorithm contains a recursive call to itself,
we describe its running time by a recurrence equation
- 18 Comp 122
Analysis of Merge Sort
Divide: computing the middle of subarray, takes
D(n) = (1)
Conquer: solving 2 subproblems of size n/2 takes
T(n) = 2T(n/2)
Combine: merging n elements takes C(n) = (n)
Total:
T(n) = (1) if n = 1
T(n) = 2T(n/2) + (n) if n > 1
T(n) = (n lg n) (can be proved using master theorem ,
as logrithmic function grow more slowly than any linear
function. Chapter 4)
- 19 Comp 122
Recurrence Relations
Solution Methods (Chapter 4)
Substitution Method.
Recursion-tree Method.
Master Method.
Recurrence relations arise when we analyze the
running time of iterative or recursive algorithms.
Ex: Divide and Conquer.
T(n) = (1) if n c
T(n) = a T(n/b) + D(n) + C(n) otherwise
- 20 Comp 122
Substitution Method
Guess the form of the solution, then
use mathematical induction to show it correct.
Substitute guessed answer for the function when the
inductive hypothesis is applied to smaller values –
hence, the name.
Works well when the solution is easy to guess.
No general way to guess the correct solution.
- 21 Comp 122
Example – Exact Function
Recurrence: T(n) = 1 if n = 1
T(n) = 2T(n/2) + n if n > 1
Guess: T(n) = n lg n + n.
Induction:
•Basis: n = 1 n lgn + n = 1 = T(n).
•Hypothesis: T(k) = k lg k + k for all k < n.
•Inductive Step: T(n) = 2 T(n/2) + n
= 2 ((n/2)lg(n/2) + (n/2)) + n
= n (lg(n/2)) + 2n
= n (lg(n) – lg(2)) + 2n
= n lg n – nlg(2) + 2n
= n lg n – n + 2n ; log2 2 = 1
- 22 = n lg n + n
Recursion-tree Method
Making a good guess is sometimes difficult
with the substitution method.
Use recursion trees to devise good guesses.
Recursion Trees
Show successive expansions of recurrences using
trees.
Keep track of the time spent on the subproblems of a
divide and conquer algorithm.
Help organize the algebraic bookkeeping necessary
to solve a recurrence.
- 23 Comp 122
Recursion Tree – Example
Running time of Merge Sort:
T(n) = (1) if n = 1
T(n) = 2T(n/2) + (n) if n > 1
Rewrite the recurrence as
T(n) = c if n = 1
T(n) = 2T(n/2) + cn if n > 1
c > 0: Running time for the base case and
time per array element for the divide and
combine steps.
- 24 Comp 122
Recursion Tree for Merge Sort
For the original problem, Each of the size n/2 problems
we have a cost of cn, has a cost of cn/2 plus two
plus two subproblems subproblems, each costing
each of size (n/2) and T(n/4).
running time T(n/2). cn
cn/2 cn/2
T(n/2) T(n/2)
T(n/4) T(n/4) T(n/4) T(n/4)
Cost of sorting
subproblems.
- 25 Comp 122
Recursion Tree for Merge Sort
Continue expanding until the problem size reduces to 1.
cn cn
cn/2 cn/2 cn
lg n
cn/4 cn/4 cn/4 cn/4 cn
c c c c c c cn
Total : cnlgn+cn
- 26 Comp 122
Recursion Tree for Merge Sort
Continue expanding until the problem size reduces to 1.
cn
•Each level has total cost cn.
•Each time we go down one level,
the number of subproblems
doubles, but the cost per
cn/2 cn/2
subproblem halves cost per
level remains the same.
•There are lg n + 1 levels, height is
cn/4 cn/4 cn/4 cn/4 lg n. (Assuming n is a power of 2.)
•Can be proved by induction.
•Total cost = sum of costs at each
level = (lg n + 1)cn = cnlgn + cn =
(n lgn).
c c c c c c
- 27 Comp 122
Other Examples
Use the recursion-tree method to determine a
guess for the recurrences
T(n) = 3T(n/4) + (n2).
T(n) = T(n/3) + T(2n/3) + O(n).
- 28 Comp 122
Recursion Trees – Caution Note
Recursion trees only generate guesses.
Verify guesses using substitution method.
A small amount of “sloppiness” can be
tolerated. Why?
If careful when drawing out a recursion tree and
summing the costs, can be used as direct proof.
- 29 Comp 122
The Master Method
Based on the Master theorem.
“Cookbook” approach for solving recurrences
of the form
T(n) = aT(n/b) + f(n)
• a 1, b > 1 are constants.
• f(n) is asymptotically positive.
• n/b may not be an integer, but we ignore floors and
ceilings. Why?
Requires memorization of three cases.
- 30 Comp 122
The Master Theorem
Theorem
Theorem4.1 4.1
Letaa 11 and
Let and bb >> 11be beconstants,
constants,let
letf(n)
f(n)be beaafunction,
function,and
and
Let
LetT(n)
T(n)bebedefined
definedon onnonnegative
nonnegativeintegers
integersby bythe
therecurrence
recurrence
T(n)
T(n)==aT(n/b)
aT(n/b)++f(n), f(n),where
wherewe wecan
canreplace
replacen/b n/bbybyn/b orn/b.
n/bor n/b.
T(n)
T(n)cancanbebebounded
boundedasymptotically
asymptoticallyin inthree
threecases:
cases:
a–) for some constant > 0, then T(n) = (nlog
1.1. IfIf f(n) = O(n ) for some constant > 0, then T(n) = (n ). logaa).
log
loga–
f(n) = O(n b
b
b
b
f(n)==
2.2. IfIf f(n) (n(nlog a),),then T(n)==(
thenT(n) (nnloglogaalglgn).
log a
b
b
b
b n).
f(n)==(n
3.3. IfIf f(n) (nlog a+)) forforsome constant>>0,0,
someconstant
log a+
b
b
and
andif,
if,for
forsome
someconstant
constantcc<<11andandall allsufficiently
sufficientlylarge
largen,n,
we
wehave a·f(n/b)ccf(n),
havea·f(n/b) f(n),then T(n)==
thenT(n) (f(n)).
(f(n)).
- 31 Comp 122