13. Sorting in Linear Time
Kyuseok Shim
Electrical and Computer Engineering
Seoul National University
Outline
All the sorting algorithms introduced thus far are comparison sorts since
the sorted order they determine is based only on comparisons between
the input elements.
We prove that any comparison sort must make Ω(n log n) comparisons
in the worst case to sort n elements.
Thus, merge sort and heapsort are asymptotically optimal, and no
comparison sort exists that is faster by more than a constant factor.
We examine other sorting algorithms, such as counting sort and radix
sort, that run in linear time.
Those algorithms use operations other than comparisons to determine
the sorted order.
Consequently, the Ω(n log n) lower bound does not apply to them.
Lower Bounds for Sorting
Comparison sorting
Uses only comparisons between elements to gain order information about an
input sequence <a_1, a_2, …, a_n>.
Given two elements a_i and a_j, we perform only one of the tests a_i < a_j,
a_i ≤ a_j, a_i ≥ a_j, or a_i > a_j to determine their relative order.
Our assumption
All input elements are distinct (i.e., we do not check a_i = a_j).
The comparisons a_i < a_j, a_i ≤ a_j, a_i ≥ a_j, and a_i > a_j are all equivalent
in that they yield identical information about the relative order of a_i and a_j.
Thus, we may assume that all comparisons have the form a_i ≤ a_j.
Decision Tree Model
A decision tree is a full binary tree representing the comparisons
between elements performed by a particular sorting algorithm
operating on an input of a given size.
In a decision tree, we annotate each internal node by i:j for some i and
j in the range 1 ≤ i, j ≤ n, where n is the number of elements in the
input sequence. Each internal node indicates a comparison a_i ≤ a_j.
We also annotate each leaf by a permutation <π(1), π(2), …, π(n)>.
The execution of the sorting algorithm corresponds to tracing a simple
path from the root of the decision tree down to a leaf.
When we come to a leaf, the sorting algorithm has established the
ordering a_π(1) ≤ a_π(2) ≤ … ≤ a_π(n).
Decision Tree Model
We consider only decision trees in which each permutation appears as a
reachable leaf.
Because any correct sorting algorithm must be able to produce each
permutation of its input, each of the n! permutations on n elements must
appear as one of the leaves of the decision tree for a comparison sort to be
correct.
Furthermore, each of these leaves must be reachable from the root by a
downward path corresponding to an actual execution of the comparison
sort.
Insertion Sort
[Figure: insertion sort on the input a_1 = 6, a_2 = 8, a_3 = 5, traced step by
step through its decision tree. The comparison 1:2 takes the ≤ branch (6 ≤ 8),
the comparison 2:3 takes the > branch (8 > 5), and the comparison 1:3 takes
the > branch (6 > 5). The array goes from 6 8 5 to 6 5 8 to 5 6 8, and the
path ends at the leaf <3,1,2>, i.e. a_3 ≤ a_1 ≤ a_2.]
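The Decision Tree for Insertion Sort
The decision tree corresponding to the insertion sort algorithm operating on an
input sequence of three elements.
[Figure: decision tree with root 1:2. Its ≤ child is the node 2:3 and its >
child is the node 1:3; each of these again branches on ≤ and >. The levels of
the tree are labelled h = 3, h = 2, h = 1, h = 0 from the root downward.]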
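The correspondence between executions and root-to-leaf paths can be checked by brute force. The sketch below is an illustration, not part of the original slides: the helper names `insertion_sort_trace` and `decision_tree_leaves` are made up here. It runs insertion sort on every permutation of n distinct values, records which pairs i:j were compared, and collects the leaf permutation that each run reaches.

```python
from itertools import permutations
from math import ceil, factorial, log2


def insertion_sort_trace(values):
    """Sort a copy of `values` with insertion sort and record each comparison.

    Returns (order, comparisons): `order` lists the original 1-based indices
    in sorted order (the leaf permutation) and `comparisons` lists the (i, j)
    pairs of original indices compared along the way (the i:j nodes).
    """
    items = [(v, idx) for idx, v in enumerate(values, start=1)]
    comparisons = []
    for pos in range(1, len(items)):
        cur = pos
        while cur > 0:
            left, right = items[cur - 1], items[cur]
            comparisons.append((left[1], right[1]))
            if left[0] <= right[0]:      # the "≤" branch: element is in place
                break
            items[cur - 1], items[cur] = right, left   # the ">" branch: swap
            cur -= 1
    return [idx for _, idx in items], comparisons


def decision_tree_leaves(n):
    """Map each reachable leaf permutation to the comparisons used to reach it."""
    leaves = {}
    for perm in permutations(range(1, n + 1)):
        order, comparisons = insertion_sort_trace(perm)
        leaves[tuple(order)] = len(comparisons)
    return leaves


print(insertion_sort_trace([6, 8, 5]))
# ([3, 1, 2], [(1, 2), (2, 3), (1, 3)])
leaves = decision_tree_leaves(3)
print(len(leaves), "leaves, worst case", max(leaves.values()), "comparisons")
print("lower bound:", ceil(log2(factorial(3))))
```

For three elements the enumeration reaches all 3! = 6 leaves, and the longest path uses 3 comparisons, which equals the height of the tree.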
A Lower Bound for the Worst Case
Theorem 8.1
Any comparison sort algorithm requires Ω(n log n) comparisons in the worst case.
Proof
It suffices to determine the height of a decision tree in which each permutation
appears as a reachable leaf.
Consider a decision tree of height h with l reachable leaves corresponding to a
comparison sort on n elements.
Because each of the n! permutations of the input appears as some leaf, n! ≤ l.
Since a binary tree of height h has no more than 2^h leaves, we have
n! ≤ l ≤ 2^h.
Taking logarithms, and then keeping only the n/2 largest terms of the sum, gives
\[
\begin{aligned}
h &\ge \log(n!) \\
  &= \log\bigl(n(n-1)(n-2)\cdots 1\bigr) \\
  &= \log n + \log(n-1) + \cdots + \log 1 \\
  &\ge \log n + \log(n-1) + \cdots + \log\frac{n}{2} \\
  &\ge \frac{n}{2}\log\frac{n}{2} = \Omega(n \log n).
\end{aligned}
\]
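To get a feel for the chain of inequalities in the proof, the following sketch evaluates its quantities numerically. It is only an illustration: the function names are chosen here, and logarithms are taken base 2, which changes the values by a constant factor only.

```python
from math import ceil, factorial, log2


def min_worst_case_comparisons(n):
    """Decision-tree lower bound: a tree with n! leaves has height >= ceil(log2(n!))."""
    return ceil(log2(factorial(n)))


def half_terms_bound(n):
    """The weaker bound (n/2) * log2(n/2) from the last step of the proof."""
    return (n / 2) * log2(n / 2)


# Columns: n, (n/2) log2(n/2), log2(n!), n log2 n
for n in (4, 16, 64, 256, 1024):
    print(n,
          round(half_terms_bound(n), 1),
          round(log2(factorial(n)), 1),
          round(n * log2(n), 1))
```

In every row the middle value log2(n!) lies between the other two, which is what the Ω(n log n) bound (together with the trivial upper bound n log n) asserts.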
A Lower Bound for the Worst Case
Corollary 8.2
Heapsort and merge sort are asymptotically optimal comparison sorts.
Proof
The O(n lg n) upper bounds on the running times for heapsort and merge
sort match the Ω(n log n) worst-case lower bound from Theorem 8.1.
Counting Sort
Assumes that each of the n input elements is an integer in
the range 1 to k, for some integer k.
When k = O(n), the sort runs in Θ(n) time.
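Use three arrays
A[1..n]: the initial input
B[1..n]: the sorted output
C[1..k]: temporary working storage (the counts)
Running time: Θ(n + k), which is O(n) when k = O(n)
Example with A = <1, 4, 3, 1, 3> and k = 4: contents of B and C after each
element is placed in the output (an underscore marks a slot not yet filled).
    placed    B[1..5]      C[1..4]
    3         _ _ _ 3 _    2 2 3 5
    1         _ 1 _ 3 _    1 2 3 5
    3         _ 1 3 3 _    1 2 2 5
    4         _ 1 3 3 4    1 2 2 4
    1         1 1 3 3 4    0 2 2 4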
Counting Sort
COUNTING-SORT(A, B, k)
    let C[1..k] be a new array
    for i = 1 to k
        C[i] = 0
    for j = 1 to A.length
        C[A[j]] = C[A[j]] + 1
    // C[i] now contains the number of elements of A equal to i.
    for i = 2 to k
        C[i] = C[i] + C[i-1]
    // C[i] now contains the number of elements of A that are at most i.
    for j = A.length downto 1
        B[C[A[j]]] = A[j]
        C[A[j]] = C[A[j]] - 1
Example input: A[1..5] = <1, 4, 3, 1, 3> with k = 4; B[1..5] is initially empty.
After the counting loop, C = <2, 0, 2, 1>; after the prefix-sum loop, C = <2, 2, 4, 5>.
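The pseudocode translates almost line by line into Python. The version below is a minimal sketch under two assumptions that are not in the slides: it returns a new list instead of filling a caller-supplied B, and it rejects keys outside 1..k with a `ValueError`. Because Python lists are 0-based, the 1-based position C[A[j]] is shifted by one when writing into the output.

```python
def counting_sort(a, k):
    """Return a sorted copy of `a`, whose elements are integers in 1..k."""
    n = len(a)
    c = [0] * (k + 1)            # c[0] is unused so that indices match 1..k
    for x in a:
        if not 1 <= x <= k:
            raise ValueError(f"element {x} is outside the range 1..{k}")
        c[x] += 1                # c[i] = number of elements equal to i
    for i in range(2, k + 1):
        c[i] += c[i - 1]         # c[i] = number of elements that are at most i
    b = [None] * n
    for x in reversed(a):        # scan right to left, as in the pseudocode
        b[c[x] - 1] = x          # c[x] is a 1-based position; b is 0-based
        c[x] -= 1
    return b


print(counting_sort([1, 4, 3, 1, 3], k=4))  # [1, 1, 3, 3, 4]
```

The three loops touch k, n, and n entries respectively (plus k for the initialization), which is where the Θ(n + k) running time comes from.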
Property of Counting Sort
It is not a comparison sort.
No comparisons between input elements occur anywhere in the
code.
The Ω(n lg n) lower bound for sorting does not apply when we
depart from the comparison sort model.
The property of stability is important when satellite data are
carried around with the element being sorted.
Counting sort is stable: numbers with the same value appear in the
output array in the same order as in the input array.
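Stability is easiest to see when each key carries satellite data. The sketch below is an illustration with made-up names: `counting_sort_records` and the `(key, label)` tuples are hypothetical. It applies the same three loops to whole records and sorts them by an integer key in 1..k.

```python
def counting_sort_records(records, k, key=lambda r: r[0]):
    """Stably sort `records` by an integer key in 1..k; returns a new list."""
    c = [0] * (k + 1)
    for r in records:
        c[key(r)] += 1
    for i in range(2, k + 1):
        c[i] += c[i - 1]
    out = [None] * len(records)
    for r in reversed(records):      # right-to-left scan preserves input order
        out[c[key(r)] - 1] = r
        c[key(r)] -= 1
    return out


records = [(3, "a"), (1, "b"), (3, "c"), (1, "d"), (4, "e")]
print(counting_sort_records(records, k=4))
# [(1, 'b'), (1, 'd'), (3, 'a'), (3, 'c'), (4, 'e')]
```

Records with equal keys keep their input order ("b" before "d", "a" before "c"). The right-to-left scan matters: the last record with a given key is written to the last slot reserved for that key.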
Radix Sort
Counting sort is efficient only when the keys are small integers.
Radix sort sorts the numbers one digit at a time.
Each input has d decimal digits (or digits in any base).
We use a stable sorting algorithm such as counting sort.
We sort repeatedly, starting from the lowest-order digit and finishing at
the highest-order digit.
Since the sorting algorithm is stable, if the numbers are sorted with
respect to their low-order digits and are later sorted on a high-order digit,
numbers having the same high-order digit remain sorted
with respect to their low-order digits.
Radix Sort
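RADIX-SORT(A, d)
    for i = 1 to d
        use a stable sort to sort the array A on digit i
(digit 1 is the lowest-order digit and digit d the highest-order digit)
Running time: O(d × (n + k)) = O(n) when d is constant and k = O(n)
(k: number of values that a digit can have)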
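As a concrete instance of RADIX-SORT, the sketch below uses a counting sort on one digit as the stable intermediate sort. The function names, the `base` parameter, and the sample numbers are choices made for this illustration. It assumes non-negative integers with at most d digits in the given base.

```python
def stable_sort_by_digit(a, exp, base):
    """Counting sort of `a` on the digit (x // exp) % base; stable."""
    c = [0] * base
    for x in a:
        c[(x // exp) % base] += 1
    for i in range(1, base):
        c[i] += c[i - 1]             # c[v] = number of digits that are at most v
    out = [0] * len(a)
    for x in reversed(a):            # right-to-left scan keeps the sort stable
        digit = (x // exp) % base
        c[digit] -= 1                # decrement first: `out` is 0-based
        out[c[digit]] = x
    return out


def radix_sort(a, d, base=10):
    """Sort non-negative integers that have at most `d` digits in `base`."""
    exp = 1
    for _ in range(d):               # digit 1 (lowest order) up to digit d
        a = stable_sort_by_digit(a, exp, base)
        exp *= base
    return a


print(radix_sort([329, 457, 657, 839, 436, 720, 355], d=3))
# [329, 355, 436, 457, 657, 720, 839]
```

Each of the d passes costs Θ(n + k) with k = `base`, so the whole sort costs Θ(d(n + k)).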
Radix Sort
Lemma 8.3
Given n d-digit numbers in which each digit can take on up to k possible
values, RADIX-SORT correctly sorts these numbers in Θ(d(n + k)) time, if
the stable sort it uses takes Θ(n + k) time.
Proof
The correctness of radix sort follows by induction on the column being
sorted (see Exercise 8.3-3).
The analysis of the running time depends on the stable sort used as the
intermediate sorting algorithm.
When each digit is in the range 0 to k-1 (so that it can take on k possible
values), and k is not too large, counting sort is the obvious choice.
Each pass over n d-digit numbers then takes Θ(n + k) time.
There are d passes, and so the total time for radix sort is Θ(d(n + k)).
Radix Sort
Lemma 8.4
Given n b-bit numbers and any positive integer r ≤ b, RADIX-SORT
correctly sorts these numbers in Θ((b/r)(n + 2^r)) time, if the stable sort
it uses takes Θ(n + k) time for inputs in the range 0 to k.
Proof
For a value r ≤ b, we view each key as having d = ⌈b/r⌉ digits of
r bits each.
Each digit is an integer in the range 0 to 2^r − 1, so that we can use
counting sort with k = 2^r − 1.
e.g.) A 32-bit word has four 8-bit digits: b = 32, r = 8,
k = 2^r − 1 = 255, d = b/r = 4.
Each pass of counting sort takes Θ(n + k) = Θ(n + 2^r) time, and there are d
passes.
The total running time is Θ(d(n + 2^r)) = Θ((b/r)(n + 2^r)).
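The bound in Lemma 8.4 depends on the digit width r, and the trade-off can be explored numerically. The sketch below is an illustration only: `radix_cost` and `best_digit_width` are hypothetical helpers, and the "cost" is the expression ⌈b/r⌉(n + 2^r) with all constant factors ignored, so it estimates growth rather than measuring running time.

```python
def radix_cost(n, b, r):
    """Estimate ceil(b / r) * (n + 2**r): number of passes times work per pass."""
    passes = -(-b // r)              # integer ceiling of b / r
    return passes * (n + 2 ** r)


def best_digit_width(n, b):
    """Return the r in 1..b that minimizes the estimate above."""
    return min(range(1, b + 1), key=lambda r: radix_cost(n, b, r))


# The example from the proof: 32-bit keys split into four 8-bit digits.
print(radix_cost(n=1_000_000, b=32, r=8))    # 4 * (1_000_000 + 256) = 4001024
print(best_digit_width(n=1_000_000, b=32))   # 16
```

Small r means many cheap passes and large r means few passes that each pay 2^r for the count array. Under this estimate, for a million 32-bit keys, two 16-bit passes come out ahead of four 8-bit passes.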
Bucket Sort
Assumes that the n input numbers are drawn from a uniform
distribution.
Like counting sort, it is fast because it assumes something about the input:
counting sort assumes small integer keys, whereas bucket sort assumes that
the input is generated by a random process that distributes elements
uniformly and independently over the interval [0,1).
Average-case running time is O(n).
Bucket Sort
Divide the interval [0,1) into 𝑛 equal-sized subintervals (buckets).
Distribute the n input numbers into the buckets.
Sort the numbers in each bucket and go through the buckets in order.
Bucket Sort
BUCKET-SORT(A)
    n = A.length
    let B[0..n-1] be a new array
    for i = 0 to n-1
        make B[i] an empty list
    for i = 1 to n
        insert A[i] into list B[⌊n·A[i]⌋]
    for i = 0 to n-1
        sort list B[i] with insertion sort
    concatenate the lists B[0], B[1], …, B[n-1] together in order
Bucket Sort
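Example with n = 10: the input array A, listed in input order (left), and the
sorted lists in the buckets B[0..9] (right).
    A        i    B[i]
    .78      0    (empty)
    .17      1    .12 .17
    .39      2    .21 .23 .26
    .26      3    .39
    .72      4    (empty)
    .94      5    (empty)
    .21      6    .68
    .12      7    .72 .78
    .23      8    (empty)
    .68      9    .94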
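BUCKET-SORT can be sketched in Python as follows, using the ten example values from the figure as input. Two details are additions for this illustration and do not come from the slides: inputs outside [0, 1) raise a `ValueError`, and the bucket index is clamped to n − 1 as a guard against floating-point rounding of n·x.

```python
def insertion_sort(lst):
    """Sort `lst` in place with insertion sort."""
    for j in range(1, len(lst)):
        key = lst[j]
        i = j - 1
        while i >= 0 and lst[i] > key:
            lst[i + 1] = lst[i]
            i -= 1
        lst[i + 1] = key


def bucket_sort(a):
    """Return a sorted copy of `a`, whose elements lie in [0, 1)."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        if not 0 <= x < 1:
            raise ValueError(f"element {x} is outside [0, 1)")
        index = min(int(n * x), n - 1)   # floor(n * x), clamped for safety
        buckets[index].append(x)
    result = []
    for bucket in buckets:               # go through the buckets in order
        insertion_sort(bucket)
        result.extend(bucket)
    return result


data = [.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]
print(bucket_sort(data))
# [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
```

With n = 10 the values .21, .23 and .26 all land in bucket 2, so that bucket is the only one where insertion sort has more than two elements to order.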
Analysis of Bucket Sort
Let n_i be the random variable denoting the number of elements placed
in bucket B[i].
Since insertion sort runs in quadratic time, the running time of
bucket sort is
\[ T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2). \]
Analysis of Bucket Sort
We now analyze the average-case running time of bucket sort, by computing
the expected value of the running time, where we take the expectation over the
input distribution.
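Taking expectations of both sides of
\[ T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2), \]
we have
\[
\begin{aligned}
E[T(n)] &= E\Bigl[\Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)\Bigr] \\
        &= \Theta(n) + \sum_{i=0}^{n-1} E\bigl[O(n_i^2)\bigr] \\
        &= \Theta(n) + \sum_{i=0}^{n-1} O\bigl(E[n_i^2]\bigr).
\end{aligned}
\]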
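Analysis of Bucket Sort
\[ T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2) \]
Define the indicator random variables X_ij = I{A[j] falls in bucket i}, for
i = 0, 1, …, n−1 and j = 1, 2, …, n, so that n_i is the sum of X_ij over j. Then
\[
\begin{aligned}
E[n_i^2] &= E\Bigl[\Bigl(\sum_{j=1}^{n} X_{ij}\Bigr)^{2}\Bigr]
          = E\Bigl[\sum_{j=1}^{n}\sum_{k=1}^{n} X_{ij}X_{ik}\Bigr] \\
         &= E\Bigl[\sum_{j=1}^{n} X_{ij}^{2}
            + \sum_{j=1}^{n}\sum_{\substack{1\le k\le n\\ k\ne j}} X_{ij}X_{ik}\Bigr] \\
         &= \sum_{j=1}^{n} E\bigl[X_{ij}^{2}\bigr]
            + \sum_{j=1}^{n}\sum_{\substack{1\le k\le n\\ k\ne j}} E\bigl[X_{ij}X_{ik}\bigr]
            \qquad (8.3)
\end{aligned}
\]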
Analysis of Bucket Sort
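The indicator random variable X_ij is 1 with probability 1/n and 0 otherwise.
Thus,
\[ E\bigl[X_{ij}^{2}\bigr] = 1^{2}\cdot\frac{1}{n} + 0^{2}\cdot\Bigl(1-\frac{1}{n}\Bigr) = \frac{1}{n}. \]
When k ≠ j, the variables X_ij and X_ik are independent, and hence
\[ E[X_{ij}X_{ik}] = E[X_{ij}]\,E[X_{ik}] = \frac{1}{n}\cdot\frac{1}{n} = \frac{1}{n^{2}}. \]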
Analysis of Bucket Sort
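Substituting these two expected values in equation (8.3), we obtain
\[
\begin{aligned}
E[n_i^2] &= \sum_{j=1}^{n}\frac{1}{n}
            + \sum_{j=1}^{n}\sum_{\substack{1\le k\le n\\ k\ne j}}\frac{1}{n^{2}} \\
         &= n\cdot\frac{1}{n} + n(n-1)\cdot\frac{1}{n^{2}} \\
         &= 1 + \frac{n-1}{n} = 2 - \frac{1}{n}.
\end{aligned}
\]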
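The closed form 2 − 1/n can be cross-checked without simulation. Under the uniform-and-independent input assumption, n_i counts how many of n independent trials hit bucket i, each with probability 1/n, so it follows a Binomial(n, 1/n) distribution. The sketch below (with a function name chosen here for illustration) sums m²·Pr{n_i = m} exactly, using rational arithmetic.

```python
from fractions import Fraction
from math import comb


def expected_square_bucket_size(n):
    """Exact E[n_i^2] when n_i follows a Binomial(n, 1/n) distribution."""
    p = Fraction(1, n)
    return sum(
        m * m * comb(n, m) * p ** m * (1 - p) ** (n - m)
        for m in range(n + 1)
    )


# Columns: n, exact E[n_i^2], closed form 2 - 1/n
for n in (1, 2, 5, 10):
    print(n, expected_square_bucket_size(n), 2 - Fraction(1, n))
```

Both columns agree for every n tried, for example 3/2 for n = 2 and 19/10 for n = 10, matching the derivation.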
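Analysis of Bucket Sort
\[ X_{ij} = I\{A[j] \text{ falls in bucket } i\}, \qquad n_i = \sum_{j=1}^{n} X_{ij} \]
\[
\begin{aligned}
E[n_i^2] &= E\Bigl[\Bigl(\sum_{j=1}^{n} X_{ij}\Bigr)^{2}\Bigr]
          = E\Bigl[\sum_{j=1}^{n}\sum_{k=1}^{n} X_{ij}X_{ik}\Bigr] \\
         &= E\Bigl[\sum_{j=1}^{n} X_{ij}^{2}
            + \sum_{j=1}^{n}\sum_{\substack{1\le k\le n\\ k\ne j}} X_{ij}X_{ik}\Bigr] \\
         &= \sum_{j=1}^{n} E\bigl[X_{ij}^{2}\bigr]
            + \sum_{j=1}^{n}\sum_{\substack{1\le k\le n\\ k\ne j}} E\bigl[X_{ij}X_{ik}\bigr]
\end{aligned}
\]
Indicator random variable
\[ X_{ij} = \begin{cases} 1 & \text{with probability } 1/n \\ 0 & \text{otherwise} \end{cases} \]
\[ E\bigl[X_{ij}^{2}\bigr] = 1^{2}\cdot\frac{1}{n} + 0^{2}\cdot\Bigl(1-\frac{1}{n}\Bigr) = \frac{1}{n} \]
X_ij and X_ik are independent (for k ≠ j).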