CS532
LECTURE 8
Counting Sort
Suppose that the values to be sorted are all between 0 and k, where k is some (small) integer. Assume the values are in the array A[1..n].

1. Use an array C[0..k] to count how many times each key occurs in A. This requires Θ(n) time.
index: 1 2 3 4 5 6 7 8
A:     2 5 3 0 2 3 0 3

index: 0 1 2 3 4 5
C:     2 0 2 3 0 1      (C[i] = number of occurrences of key i)

2. Replace each C[i] by the running sum C[0] + ... + C[i], so that C[i] becomes the number of keys that are at most i. This requires Θ(k) time.

index: 0 1 2 3 4 5
C:     2 2 4 7 7 8      (C[i] = number of keys <= i)
These numbers tell us, for example, that the three occurrences of 3 should be in places 5, 6, 7 of the final array.

3. Copy the data into the target array B, scanning A from right to left and decrementing the appropriate counter after each copy. The time is Θ(n).
B index: 1 2 3 4 5 6 7 8        C index: 0 1 2 3 4 5

Copy A[8] = 3 into B[7]:   B: _ _ _ _ _ _ 3 _    C: 2 2 4 6 7 8
Copy A[7] = 0 into B[2]:   B: _ 0 _ _ _ _ 3 _    C: 1 2 4 6 7 8
Copy A[6] = 3 into B[6]:   B: _ 0 _ _ _ 3 3 _    C: 1 2 4 5 7 8
Copy A[5] = 2 into B[4]:   B: _ 0 _ 2 _ 3 3 _    C: 1 2 3 5 7 8
...
Assuming k = O(n), the total time is O(n), better than any comparison sort. Note that counting sort is stable: it preserves the relative order of elements that have the same key. (The previously seen sorting algorithms do not have this property, but some of them do have stable versions.)
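The three steps above can be sketched as follows (a minimal Python rendering, using 0-based indexing instead of the notes' 1-based arrays):

```python
def counting_sort(A, k):
    """Stable counting sort of A, whose keys lie in 0..k."""
    n = len(A)
    C = [0] * (k + 1)
    for x in A:                  # step 1: count occurrences, Theta(n)
        C[x] += 1
    for i in range(1, k + 1):    # step 2: running sums, Theta(k)
        C[i] += C[i - 1]         # now C[i] = number of keys <= i
    B = [None] * n
    for x in reversed(A):        # step 3: copy right-to-left for stability
        C[x] -= 1
        B[C[x]] = x
    return B

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))  # [0, 0, 2, 2, 3, 3, 3, 5]
```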
Radix Sort
Suppose that the values to be sorted are written as d-digit numbers. Use some stable sort to sort them by the last digit. Then stable sort them by the second least significant digit, then by the third, etc. If we use counting sort as the stable sort, the total time is O(nd), which is O(n) when d is a constant.
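A sketch of this least-significant-digit-first scheme, using a stable counting sort on each digit (names are illustrative, not from the notes):

```python
def radix_sort(A, d, base=10):
    """Sort d-digit numbers by running a stable counting sort per digit,
    starting from the least significant digit."""
    for pos in range(d):
        div = base ** pos
        C = [0] * base
        for x in A:                    # count digit occurrences
            C[(x // div) % base] += 1
        for i in range(1, base):       # running sums
            C[i] += C[i - 1]
        B = [None] * len(A)
        for x in reversed(A):          # right-to-left copy keeps stability
            digit = (x // div) % base
            C[digit] -= 1
            B[C[digit]] = x
        A = B
    return A

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
```

Stability of the inner sort is essential: it is what lets the ordering established by earlier (less significant) digits survive the later passes.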
Bucket Sort
Suppose that we want to sort n items that are evenly distributed over the interval [0, D]. We split the interval [0, D] into n equal buckets (subintervals)

[0, d), [d, 2d), [2d, 3d), ..., [(n-1)d, nd]      (d = D/n)
We sort the data by distributing it into the appropriate buckets, then sort each bucket, then just concatenate the results.

Complexity. The expected time Ei to sort the ith bucket turns out [with some computation] to be Ei = 2 - 1/n = O(1).
Therefore, the expected time for the whole algorithm is E1 + ... + En = n * O(1) = O(n).
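The distribute-sort-concatenate scheme can be sketched as follows (a minimal version; it delegates the per-bucket sort to Python's built-in `sorted`, which is an assumption of convenience rather than part of the notes):

```python
def bucket_sort(A, D):
    """Sort values evenly distributed over [0, D]: distribute into n
    buckets of width d = D/n, sort each bucket, then concatenate."""
    n = len(A)
    buckets = [[] for _ in range(n)]
    for x in A:
        i = min(int(x * n / D), n - 1)   # x falls in bucket [i*d, (i+1)*d)
        buckets[i].append(x)
    out = []
    for b in buckets:
        out.extend(sorted(b))            # each bucket has O(1) expected size
    return out

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12], 1.0))
```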
Order Statistics
The ith order statistic of a set of n elements is the ith smallest element in the set. Thus, the minimum and the maximum are the 1st and nth order statistics. The median(s) are the floor((n+1)/2)th and ceil((n+1)/2)th order statistics. We can compute each statistic in O(n log n) time just by sorting the input and then selecting the right entry. But sometimes we can obviously do better: finding the maximum, for example, takes only n - 1 comparisons.

Example. How many comparisons are needed to find both the maximum and the minimum? Suppose there are 2k elements. We process them in pairs

a1, a2    a3, a4    a5, a6    ...    a(2k-1), a(2k)
and at each step adjust the current minimum (m) and maximum (M) values. The work done at the ith step is 3 comparisons: compare a(2i-1) with a(2i), then compare the smaller with m, then compare the larger with M. The total is less than 3k, much less than computing the maximum and minimum separately.

Theorem. It is possible to compute the kth order statistic in O(n) time.

A modification of QuickSort can be used to find the order statistics in expected linear time.
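The pairing trick above can be sketched as follows (a minimal version, assuming an even number of elements as in the 2k argument):

```python
def min_max(A):
    """Find both min and max with about 3n/2 comparisons by processing
    the input in pairs, as in the argument above. Assumes len(A) even."""
    m, M = min(A[0], A[1]), max(A[0], A[1])
    for i in range(2, len(A), 2):
        if A[i] < A[i + 1]:          # 1 comparison within the pair
            lo, hi = A[i], A[i + 1]
        else:
            lo, hi = A[i + 1], A[i]
        if lo < m:                   # 1 comparison against current min
            m = lo
        if hi > M:                   # 1 comparison against current max
            M = hi
    return m, M

print(min_max([3, 1, 4, 1, 5, 9, 2, 6]))  # (1, 9)
```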
QuickSelect(A, i, j, k)
    if i = j
        return A[i]
    else
        p <- Partition(A, i, j)
        q <- p - i + 1            (q = number of elements in A[i..p])
        if k = q
            return A[p]
        else if k < q
            return QuickSelect(A, i, p - 1, k)
        else
            return QuickSelect(A, p + 1, j, k - q)
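A runnable Python rendering of the idea (a sketch: it uses a random pivot and builds sublists rather than partitioning in place, which is an assumption of convenience; the recursion structure matches the pseudocode above):

```python
import random

def quickselect(A, k):
    """Return the kth order statistic of A (k = 1 is the minimum),
    in expected linear time."""
    pivot = random.choice(A)
    less = [x for x in A if x < pivot]
    equal = [x for x in A if x == pivot]
    if k <= len(less):
        return quickselect(less, k)
    if k <= len(less) + len(equal):
        return pivot
    greater = [x for x in A if x > pivot]
    return quickselect(greater, k - len(less) - len(equal))

print(quickselect([2, 5, 3, 0, 2, 3, 0, 3], 4))  # 2 (sorted: 0 0 2 2 3 3 3 5)
```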
Comparator Networks
Sorting networks are built from comparator components, each of which can be used to arrange two given values in sorted order:
[Figure: a comparator takes inputs x and y and outputs min(x, y) and max(x, y).]

[Figures: two example networks, of depth 5 and depth 3.]
Assuming we can run comparators in parallel, the running time is proportional to the depth of the network.
Suppose n is a power of 2. We will build a network that sorts n inputs using Merger boxes. We will assume it sorts 0-1 sequences when we argue about its correctness. Using the Zero-One Principle, we will then know that it also sorts items of any other type correctly. The design of the sorter:
[Figure: Merger(n) = reverse the second half of the input, then feed the result to a bitonic sorter.]

A bitonic sequence is one that increases and then decreases, or decreases and then increases. (Examples: 11100001, 001100, 00011, 000.) A bitonic sorter is a network that sorts any bitonic sequence given as input. The key component of a bitonic sorter is the HalfCleaner:
If the input to the HalfCleaner is a bitonic 0-1 sequence, then
- both halves of the output are bitonic,
- every element in the bottom half is at least as small as every element in the top half,
- at least one half is clean: all 0s or all 1s.

Proof. [Figure: case analysis on where the block of 1s falls in the bitonic input; in each case the comparators leave one half clean.]
Complexity. The depth of BitonicSorter(n) is log n. The reversal is just rewiring (takes no time), so the depth of Merger(n) is also log n. Looking at the picture of the whole network, we see that its depth D(n) satisfies the recurrence

D(n) = D(n/2) + log n.

It follows that

D(n) = log n + log(n/2) + log(n/4) + ... + 1.
There are log n terms in this sum and each is at most log n. Thus, D(n) is O(log^2 n). We can sort n numbers in O(log^2 n) time on a parallel machine.
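The construction can be sketched in code, run sequentially rather than in parallel (a sketch under one assumption: instead of reversing the second half before the bitonic sorter, as in the Merger figure, it sorts the two halves in opposite directions, which produces the same bitonic input to the merge):

```python
def bitonic_sort(a, ascending=True):
    """Bitonic sorting network for len(a) a power of 2."""
    n = len(a)
    if n == 1:
        return a
    # Sorting the halves in opposite directions makes their
    # concatenation a bitonic sequence.
    first = bitonic_sort(a[:n // 2], True)
    second = bitonic_sort(a[n // 2:], False)
    return bitonic_merge(first + second, ascending)

def bitonic_merge(a, ascending):
    """Merger: one layer of HalfCleaner comparators, then recurse
    on both halves."""
    n = len(a)
    if n == 1:
        return a
    for i in range(n // 2):          # HalfCleaner: compare a[i], a[i + n/2]
        if (a[i] > a[i + n // 2]) == ascending:
            a[i], a[i + n // 2] = a[i + n // 2], a[i]
    return (bitonic_merge(a[:n // 2], ascending)
            + bitonic_merge(a[n // 2:], ascending))

print(bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Counting levels of comparators in this recursion reproduces the depth bound: each merge contributes log n levels, and there are log n merge stages, giving O(log^2 n).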