Module 2 AOA
Quicksort
QuickSort Design
● Follows the divide-and-conquer paradigm.
● Divide: Partition (separate) the array A[p..r] into two
(possibly empty) subarrays A[p..q–1] and A[q+1..r].
● Each element in A[p..q–1] ≤ A[q].
● A[q] ≤ each element in A[q+1..r].
● Index q is computed as part of the partitioning procedure.
● Conquer: Sort the two subarrays by recursive calls to
quicksort.
● Combine: The subarrays are sorted in place – no
work is needed to combine them.
● How do the divide and combine steps of quicksort
compare with those of merge sort?
Pseudocode
Quicksort(A, p, r)
    if p < r then
        q := Partition(A, p, r);
        Quicksort(A, p, q – 1);
        Quicksort(A, q + 1, r)

Partition(A, p, r)
    x := A[r]; i := p – 1;
    for j := p to r – 1 do
        if A[j] ≤ x then
            i := i + 1;
            A[i] ↔ A[j]
    A[i + 1] ↔ A[r];
    return i + 1

[Figure: Partition rearranges A[p..r] around the pivot 5 into A[p..q – 1] (elements ≤ 5), the pivot, and A[q+1..r] (elements ≥ 5).]
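The pseudocode above can be sketched in Python as follows. This is a minimal illustration, assuming 0-based inclusive indices instead of the slides' 1-based ones; the function names and the sample call are illustrative choices, not part of the original slides.

```python
def partition(a, p, r):
    """Partition a[p..r] (inclusive) around the pivot a[r]; return the pivot's final index."""
    x = a[r]                  # pivot
    i = p - 1                 # a[p..i] holds the elements <= x found so far
    for j in range(p, r):     # j runs over p .. r-1
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # place the pivot between the two regions
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place by recursive partitioning."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)


data = [2, 5, 8, 3, 9, 4, 1, 7, 10, 6]
q = partition(data, 0, len(data) - 1)
print(q, data)    # 5 [2, 5, 3, 4, 1, 6, 9, 7, 10, 8]
quicksort(data)
print(data)       # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The first call reproduces the partitioning example used in these slides (pivot 6); the returned index 5 is 0-based and corresponds to position 6 in the slides' numbering.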
Example
Partition(A, p, r) with p = 1, r = 10; pivot x = A[r] = 6. Each line shows the array and the values of i and j at the start of a loop iteration.
initially:        2 5 8 3 9 4 1 7 10 6    i = 0, j = 1
next iteration:   2 5 8 3 9 4 1 7 10 6    i = 1, j = 2
next iteration:   2 5 8 3 9 4 1 7 10 6    i = 2, j = 3
next iteration:   2 5 8 3 9 4 1 7 10 6    i = 2, j = 4
next iteration:   2 5 3 8 9 4 1 7 10 6    i = 3, j = 5
next iteration:   2 5 3 8 9 4 1 7 10 6    i = 3, j = 6
next iteration:   2 5 3 4 9 8 1 7 10 6    i = 4, j = 7
next iteration:   2 5 3 4 1 8 9 7 10 6    i = 5, j = 8
next iteration:   2 5 3 4 1 8 9 7 10 6    i = 5, j = 9
next iteration:   2 5 3 4 1 8 9 7 10 6    i = 5, j = 10 (loop ends)
after final swap: 2 5 3 4 1 6 9 7 10 8    returns i + 1 = 6
Partitioning
● Select the last element A[r] in the subarray
A[p..r] as the pivot – the element around which
to partition.
● As the procedure executes, the array is
partitioned into four (possibly empty) regions.
1. A[p..i]: all entries in this region are ≤ pivot.
2. A[i+1..j – 1]: all entries in this region are > pivot.
3. A[r] = pivot.
4. A[j..r – 1]: not yet known how they compare to pivot.
● Conditions 1 to 3 hold before each iteration of the for
loop, and constitute a loop invariant. (Region 4 is not part
of the loop invariant.)
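The loop invariant can be checked mechanically. The sketch below is an illustration under assumptions: it uses 0-based inclusive indices, and `partition_checked` is a hypothetical name. It asserts conditions 1 to 3 at the start of every iteration of the for loop.

```python
def partition_checked(a, p, r):
    """Partition a[p..r] around a[r], asserting the loop invariant before each iteration."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        assert all(v <= x for v in a[p:i + 1])   # condition 1: a[p..i] <= pivot
        assert all(v > x for v in a[i + 1:j])    # condition 2: a[i+1..j-1] > pivot
        assert a[r] == x                         # condition 3: pivot is still at a[r]
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


sample = [2, 5, 8, 3, 9, 4, 1, 7, 10, 6]
pivot_index = partition_checked(sample, 0, len(sample) - 1)
print(pivot_index, sample)
```

If any assertion failed, the invariant would be violated; on the sample input the run completes without error.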
Correctness of Partition
● Use loop invariant.
● Initialization:
  ● Before the first iteration, A[p..i] and A[i+1..j – 1] are empty – Conds. 1 and 2 are satisfied (trivially).
  ● r is the index of the pivot – Cond. 3 is satisfied.
● Maintenance:
  ● Case 1: A[j] > x
    ● Increment j only.
    ● Loop invariant is maintained.
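Correctness of Partition
[Figure: Case 1 (A[j] > x). Before the iteration: A[p..i] ≤ x, A[i+1..j – 1] > x, pivot x at A[r]. After: j advances by one and the > x region grows by one element.]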
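Correctness of Partition
● Case 2: A[j] ≤ x
  ● Increment i
  ● Swap A[i] and A[j]
    ● Condition 1 is maintained.
  ● Increment j
    ● Condition 2 is maintained.
  ● A[r] is unaltered.
    ● Condition 3 is maintained.
[Figure: Case 2 (A[j] ≤ x). Before the iteration: A[p..i] ≤ x, A[i+1..j – 1] > x, pivot x at A[r]. After: the ≤ x region grows by one element and the > x region shifts right by one.]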
Correctness of Partition
● Termination:
● When the loop terminates, j = r, so all elements
in A are partitioned into one of the three cases:
● A[p..i] ≤ pivot
● A[i+1..j – 1] > pivot
● A[r] = pivot
● The last two lines swap A[i+1] and A[r] and return i + 1.
● Pivot moves from the end of the array to
between the two subarrays.
● Thus, procedure partition correctly performs
the divide step.
Complexity of Partition
● To sort a[left...right]:
1. if left < right:
1.1. Partition a[left...right] such that:
all a[left...p-1] are less than a[p], and
all a[p+1...right] are >= a[p]
1.2. Quicksort a[left...p-1]
1.3. Quicksort a[p+1...right]
2. Terminate
Partitioning in Quicksort
if p < r
then q ← PARTITION(A, p, r)
= Θ(n²)
Case Between Worst and Best
● 9-to-1 proportional split
Q(n) = Q(9n/10) + Q(n/10) + n
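A recursion-tree argument shows why even this lopsided split gives an n lg n bound. The derivation below is a sketch assuming Q(1) = Θ(1) and ignoring floors and ceilings.

```latex
% Recursion tree for Q(n) = Q(9n/10) + Q(n/10) + n.
% Every level of the tree has total cost at most n.
% Shortest root-to-leaf path (always the n/10 branch): depth \log_{10} n.
% Longest root-to-leaf path (always the 9n/10 branch): depth \log_{10/9} n.
\begin{align*}
Q(n) &\ge n \log_{10} n
  && \text{(the first } \log_{10} n \text{ levels are full, each costing } n\text{)} \\
Q(n) &\le n \log_{10/9} n + O(n)
  && \text{(at most } \log_{10/9} n + 1 \text{ levels, each costing at most } n\text{)} \\
\log_{10/9} n &= \frac{\lg n}{\lg (10/9)} = \Theta(\lg n) \\
\Rightarrow\quad Q(n) &= \Theta(n \lg n)
\end{align*}
```

Any split with constant proportionality behaves the same way; only the constant hidden in Θ changes.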
Analysis of Algorithms
Merge Sort and Quick Sort
Sorting
• Insertion sort
– Design approach: incremental
– Sorts in place: Yes
– Best case: Θ(n)
– Worst case: Θ(n²)
• Bubble Sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
Sorting
• Selection sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
• Merge Sort
– Design approach: divide and conquer
– Sorts in place: No
– Running time: Let’s see!!
Divide-and-Conquer
• Divide the problem into a number of sub-problems
– Similar sub-problems of smaller size
Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing
more to do
• Combine
– Merge the two sorted subsequences
Merge Sort
Alg.: MERGE-SORT(A, p, r)
[Figure: A[1..8] = 5 2 4 7 1 3 2 6, with p, q, r marking the first, middle, and last index.]
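The divide, conquer, and combine steps can be sketched in Python as below. This is an illustration under assumptions, not the slides' MERGE-SORT procedure: it returns a new list rather than sorting A[p..r] in place, and it uses the standard-library `heapq.merge` as a stand-in for the MERGE step.

```python
from heapq import merge as merge_sorted   # stand-in for the MERGE step


def merge_sort(seq):
    """Return a new sorted list built by divide, conquer, and combine."""
    if len(seq) <= 1:                     # size 0 or 1: nothing more to do
        return list(seq)
    mid = (len(seq) + 1) // 2             # left half takes the extra element when n is odd
    left = merge_sort(seq[:mid])          # conquer: sort each half recursively
    right = merge_sort(seq[mid:])
    return list(merge_sorted(left, right))    # combine: merge the two sorted halves


result = merge_sort([5, 2, 4, 7, 1, 3, 2, 6])
print(result)   # [1, 2, 2, 3, 4, 5, 6, 7]
```

Giving the left half the extra element matches the splits in the examples that follow (q = 4 for n = 8, q = 6 for n = 11).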
Example – n Power of 2
Divide (indices 1..8; q = 4 at the top level; "|" marks each split):
5 2 4 7 1 3 2 6
5 2 4 7 | 1 3 2 6
5 2 | 4 7 | 1 3 | 2 6
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Example – n Power of 2
Conquer and Merge (read from the bottom row up; the top row is the sorted result):
1 2 2 3 4 5 6 7
2 4 5 7 | 1 2 3 6
2 5 | 4 7 | 1 3 | 2 6
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Example – n Not a Power of 2
Divide (indices 1..11; "|" marks each split):
4 7 2 6 1 4 7 3 5 2 6                      (q = 6)
4 7 2 6 1 4 | 7 3 5 2 6                    (q = 3 on the left, q = 9 on the right)
4 7 2 | 6 1 4 | 7 3 5 | 2 6
4 7 | 2 | 6 1 | 4 | 7 3 | 5 | 2 | 6
4 | 7 | 6 | 1 | 7 | 3                      (positions 1, 2, 4, 5, 7, 8)
Example – n Not a Power of 2
Conquer and Merge (read from the bottom row up; the top row is the sorted result):
1 2 2 3 4 4 5 6 6 7 7
1 2 4 4 6 7 | 2 3 5 6 7
2 4 7 | 1 4 6 | 3 5 7 | 2 6
4 7 | 2 | 1 6 | 4 | 3 7 | 5 | 2 | 6
4 | 7 | 6 | 1 | 7 | 3                      (positions 1, 2, 4, 5, 7, 8)
Merging
• Idea for merging:
– Two piles of sorted cards
  • Choose the smaller of the two top cards
  • Remove it and place it in the output pile
– Repeat the process until one pile is empty
– Take the remaining input pile and place it face-down
onto the output pile
[Figure: A[p..r] = 2 4 5 7 1 2 3 6 at indices 1..8; the sorted piles A1 = A[p..q] and A2 = A[q+1..r] are merged into A[p..r].]
Example: MERGE(A, 9, 12, 16)
[Figure sequence: step-by-step execution of MERGE with p = 9, q = 12, r = 16.]
Done!
Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
    Compute n1 and n2
    Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
    L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
    i ← 1; j ← 1
    for k ← p to r
        do if L[ i ] ≤ R[ j ]
            then A[k] ← L[ i ]
                 i ← i + 1
            else A[k] ← R[ j ]
                 j ← j + 1
[Figure: A[p..r] = 2 4 5 7 1 2 3 6 at indices 1..8; L = 2 4 5 7 ∞ is copied from A[p..q] and R = 1 2 3 6 ∞ from A[q+1..r].]
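The MERGE pseudocode translates almost line for line into Python. The sketch below assumes 0-based inclusive indices and numeric elements, so that `math.inf` can serve as the ∞ sentinel; the sample call mirrors the array shown in the figure.

```python
import math


def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) back into a[p..r]."""
    left = a[p:q + 1] + [math.inf]        # L with its sentinel
    right = a[q + 1:r + 1] + [math.inf]   # R with its sentinel
    i = j = 0
    for k in range(p, r + 1):             # exactly n1 + n2 iterations
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


cards = [2, 4, 5, 7, 1, 2, 3, 6]
merge(cards, 0, 3, 7)
print(cards)   # [1, 2, 2, 3, 4, 5, 6, 7]
```

The sentinels remove the need to test whether either pile is empty: once one pile is exhausted, its sentinel loses every comparison against the remaining real elements.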
Running Time of Merge
(assuming the version with a single final for loop over k)
• Initialization (copying into temporary arrays):
– Θ(n1 + n2) = Θ(n)
• Adding the elements to the final array:
– n iterations, each taking constant time ⇒ Θ(n)
• Total time for Merge:
– Θ(n)
Analyzing Divide-and-Conquer Algorithms
• The recurrence is based on the three steps of
the paradigm:
– T(n) – running time on a problem of size n
– Divide the problem into a subproblems, each of size
n/b: takes D(n)
– Conquer (solve) the subproblems: takes aT(n/b)
– Combine the solutions: takes C(n)
T(n) = Θ(1)                        if n ≤ c
T(n) = aT(n/b) + D(n) + C(n)       otherwise
MERGE-SORT Running Time
• Divide:
– compute q as the average of p and r: D(n) = Θ(1)
• Conquer:
– recursively solve 2 subproblems, each of size n/2
⇒ 2T (n/2)
• Combine:
– MERGE on an n-element subarray takes Θ(n) time
⇒ C(n) = Θ(n)
T(n) = Θ(1)                if n = 1
T(n) = 2T(n/2) + Θ(n)      if n > 1
Solve the Recurrence
T(n) = c                   if n = 1
T(n) = 2T(n/2) + cn        if n > 1
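One way to solve this recurrence is repeated substitution. The derivation below is a sketch that assumes n is a power of 2, so that n/2^i is always an integer.

```latex
\begin{align*}
T(n) &= 2T(n/2) + cn \\
     &= 4T(n/4) + 2cn \\
     &= 8T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^{i}\,T(n/2^{i}) + i\,cn \\
\text{with } i = \lg n:\quad
T(n) &= n\,T(1) + cn \lg n = cn + cn \lg n = \Theta(n \lg n)
\end{align*}
```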
Merge Sort - Discussion
• Running time is insensitive to the input
• Advantages:
– Guaranteed to run in Θ(nlgn)
• Disadvantage
– Requires extra space ≈N
Sorting Challenge 1
Problem: Sort a file of huge records with tiny
keys
Example application: Reorganize your MP-3 files
Sorting Files with Huge Records and
Small Keys
• Insertion sort or bubble sort?
• Selection sort?
Sorting Challenge 2
Problem: Sort a huge randomly-ordered file of
small records
Application: Process transaction records for a
phone company
Sorting Huge, Randomly-Ordered Files
• Selection sort?
– NO, always takes quadratic time
• Bubble sort?
– NO, quadratic time for randomly-ordered keys
• Insertion sort?
– NO, quadratic time for randomly-ordered keys
• Mergesort?
– YES, it is designed for this problem
Sorting Challenge 3
Problem: sort a file that is already almost in
order
Applications:
– Re-sort a huge database after a few changes
– Double-check that someone else sorted a file
Which sorting method to use?
A. Mergesort, guaranteed to run in time ~NlgN
B. Selection sort
C. Bubble sort
D. A custom algorithm for almost in-order files
E. Insertion sort
Sorting Files That are Almost in Order
• Selection sort?
– NO, always takes quadratic time
• Bubble sort?
– NO, bad for some definitions of “almost in order”
– Ex: B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
• Insertion sort?
– YES, takes linear time for most definitions of “almost
in order”
• Mergesort or custom method?
– Probably not: insertion sort simpler and faster
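The contrast between insertion sort and bubble sort on the example above can be made concrete by counting work. The sketch below is illustrative: the helper names `insertion_sort_shifts` and `bubble_sort_passes` are hypothetical, and the bubble sort is assumed to be the basic left-to-right variant that stops after a pass with no swaps.

```python
def insertion_sort_shifts(items):
    """Insertion sort on a copy; also return how many element shifts were needed."""
    a = list(items)
    shifts = 0
    for k in range(1, len(a)):
        key = a[k]
        m = k - 1
        while m >= 0 and a[m] > key:   # shift larger elements one slot right
            a[m + 1] = a[m]
            m -= 1
            shifts += 1
        a[m + 1] = key
    return a, shifts


def bubble_sort_passes(items):
    """Left-to-right bubble sort on a copy; also return the number of passes made."""
    a = list(items)
    passes = 0
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for k in range(len(a) - 1):
            if a[k] > a[k + 1]:
                a[k], a[k + 1] = a[k + 1], a[k]
                swapped = True
    return a, passes


almost = list("BCDEFGHIJKLMNOPQRSTUVWXYZA")
sorted_by_insertion, shifts = insertion_sort_shifts(almost)
sorted_by_bubble, passes = bubble_sort_passes(almost)
print(shifts, passes)   # 25 26
```

Insertion sort needs 25 shifts, linear in the 26 keys. This bubble sort needs 26 full passes of 25 comparisons each, because the misplaced A moves only one position left per pass.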