2100 2122 8 Sorting in Linear Time

CUHK CSCI2100

Uploaded by

findkellyho
Sorting in Linear Time

Can We Sort in Linear Time?


• Merge sort, heapsort and quicksort are all
comparison sorts.
• Recall that any comparison sort must make
Ω(n log n) comparisons in the worst case.
• To beat this lower bound, we must sort
without comparing elements.
• Examples of such algorithms: bucket sort, counting
sort, radix sort.
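
As a brief reminder of where the lower bound comes from (the standard decision-tree argument, sketched here and not taken from these slides): a comparison sort on n distinct keys must distinguish all n! possible input orderings, so its decision tree has at least n! leaves. A binary tree with that many leaves has height h (the worst-case number of comparisons) satisfying

```latex
h \;\ge\; \log_2(n!) \;\ge\; \log_2\!\left(\left(\tfrac{n}{2}\right)^{n/2}\right)
  \;=\; \tfrac{n}{2}\,\log_2\tfrac{n}{2} \;=\; \Omega(n \log n)
```

The middle step uses the fact that the largest n/2 factors of n! are each at least n/2.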

Facts About Counting Sort
• It is not a comparison sort.
• It requires the elements to be integers in the range 0 to K.
• It runs in linear time when K = O(N).
• It is stable.
• It is often used as a subroutine in radix sort.
• It does NOT sort in place.

Step 1: Counting
• We prepare an array C for counting the occurrences
of each value in the range 0 to K.
• The array needs K + 1 entries (indices 0 to K).
• We go through each element of the original
array A and increment its count in C.

A   0 5 3 2 3 0 5 1 5 2

    0 1 2 3 4 5
C   2 1 2 2 0 3

Step 2: Accumulation
• Compute the cumulative counts of C, giving C’.
• C’[i] is the number of elements that are
less than or equal to i.

C 2 1 2 2 0 3
0 1 2 3 4 5

C’ 2 3 5 7 7 10
0 1 2 3 4 5
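
A minimal sketch of steps 1 and 2 together may make the meaning of C’ concrete. The helper name `cumulative_counts` and its signature are illustrative assumptions, not part of the slides, and every key is assumed to lie in 0..k.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a newly allocated array c of k + 1 entries in which c[v] is the
 * number of elements of a[0..n-1] that are <= v.
 * Assumes every key lies in 0..k. The caller frees the result.
 * (Hypothetical helper, for illustration only.) */
int *cumulative_counts(int n, const int *a, int k) {
    int *c = calloc((size_t)k + 1, sizeof *c);
    if (c == NULL)
        return NULL;
    for (int i = 0; i < n; i++) {      /* step 1: count each key */
        assert(a[i] >= 0 && a[i] <= k);
        c[a[i]]++;
    }
    for (int v = 1; v <= k; v++)       /* step 2: running total */
        c[v] += c[v - 1];
    return c;
}
```

For the example array with k = 5, the returned array is 2 3 5 7 7 10, which matches C’ on this slide.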
Step 3: Place Elements
• The last step is to place the elements of A in sorted order in a
new array B using C’.
• We process the elements of A in reverse order (we'll see
why later).
• For each element x, look up the position j = C’[x], copy x to
B[j - 1], and decrement C’[x].
• Copy the contents of B back to A to finish the sort.

A   0 5 3 2 3 0 5 1 5 2

    0 1 2 3 4 5
C’  2 3 5 7 7 10

    B[0] B[1] B[2] B[3] B[4] B[5] B[6] B[7] B[8] B[9]
B   0    0    1    2    2    3    3    5    5    5

Step 3: Place Elements
Trace of the placement loop, from i = 9 down to i = 0:

A   0 5 3 2 3 0 5 1 5 2

i   A[i]   C’[A[i]]   placed in
9   2      5 → 4      B[4]
8   5      10 → 9     B[9]
7   1      3 → 2      B[2]
6   5      9 → 8      B[8]
5   0      2 → 1      B[1]
4   3      7 → 6      B[6]
3   2      4 → 3      B[3]
2   3      6 → 5      B[5]
1   5      8 → 7      B[7]
0   0      1 → 0      B[0]

    0 1 2 3 4 5
C’  0 2 3 5 7 7    (after all placements)

    B[0] B[1] B[2] B[3] B[4] B[5] B[6] B[7] B[8] B[9]
B   0    0    1    2    2    3    3    5    5    5

Counting Sort: Code
• Predict the value of K or find it by scanning the elements of A.
• Step 1 (counting) takes O(N); step 2 (accumulation) takes O(K).
• Step 3 (placing the elements) takes O(N).

#include <stdlib.h>

#define K 10    /* largest key value; keys must lie in 0..K */

/* Sort a[0..n-1] into b[0..n-1]; every key must lie in 0..k. */
void csort(int n, int *a, int *b, int k){
    int i;
    int *c = (int *)calloc(k + 1, sizeof(int));  /* counts for keys 0..k */

    for (i = 0; i < n; i++)                      /* Step 1: counting */
        c[a[i]]++;
    for (i = 1; i <= k; i++)                     /* Step 2: accumulation */
        c[i] = c[i] + c[i - 1];
    for (i = n - 1; i >= 0; i--)                 /* Step 3: placement */
        b[--c[a[i]]] = a[i];
    free(c);
}

Step 1: Counting
for (i = 0; i < n; i++)
    c[a[i]]++;

A   0 5 3 2 3 0 5 1 5 2

    0 1 2 3 4 5
C   2 1 2 2 0 3

Step 2: Accumulation
for (i = 1; i <= k; i++)    /* indices 1..k, since keys range over 0..k */
    c[i] = c[i] + c[i - 1];

    0 1 2 3 4 5
C   2 1 2 2 0 3
C’  2 3 5 7 7 10

Step 3: Place Elements
for (i = n - 1; i >= 0; i--)
    b[--c[a[i]]] = a[i];

A   0 5 3 2 3 0 5 1 5 2

    0 1 2 3 4 5
C’  2 3 5 7 7 10

    B[0] B[1] B[2] B[3] B[4] B[5] B[6] B[7] B[8] B[9]
B   0    0    1    2    2    3    3    5    5    5

Counting Sort: Code (2)
• We need a driver function to prepare B and copy B back
to A.
• If you need to calculate K, you can put your code here.
• The complexity of counting sort is O(N + K).
• If K = O(N), then T(N) = O(N).
void countingsort(int n, int *a){
    int i;
    int *b = (int *)malloc(n * sizeof(int));  /* temporary output array */

    csort(n, a, b, K);                        /* sort a into b */

    for (i = 0; i < n; i++)                   /* copy the result back to a */
        a[i] = b[i];

    free(b);
}
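
The slide notes that K may need to be calculated but does not show how. One possible approach, sketched here under the assumption that all keys are non-negative integers, is to scan A once for its largest element. The helper name `max_key` is hypothetical.

```c
#include <assert.h>

/* Returns the largest key in a[0..n-1], or -1 if any key is negative.
 * Assumes n >= 1. Takes one O(N) pass. (Hypothetical helper.) */
int max_key(int n, const int *a) {
    assert(n >= 1);
    int k = a[0];
    for (int i = 0; i < n; i++) {
        if (a[i] < 0)
            return -1;      /* a negative key cannot index the counting array */
        if (a[i] > k)
            k = a[i];
    }
    return k;
}
```

The scan costs O(N), so the overall O(N + K) bound is unchanged. Counting sort is still only linear when the maximum found this way is O(N).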

Stability
• Elements with equal keys appear in the output array in
the same relative order as in the input.
• This is why we process the array A in reverse order.
• This is an important property when we use counting
sort as a subroutine in radix sort.

A   0₁ 5₁ 3₁ 2₁ 3₂ 0₂ 5₂ 1 5₃ 2₂

B   0₁ 0₂ 1 2₁ 2₂ 3₁ 3₂ 5₁ 5₂ 5₃    (stable)

for (i = n - 1; i >= 0; i--)
    b[--c[a[i]]] = a[i];
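
To see why the scan direction matters, the sketch below sorts records that carry a tag alongside the key, once scanning A from the last element and once from the first. The `Record` type, the function name `csort_records` and the `reverse` flag are assumptions made for this illustration; they do not appear in the slides.

```c
#include <assert.h>
#include <stdlib.h>

/* A key plus a tag recording the order of appearance among equal keys.
 * (Hypothetical type, for illustration only.) */
typedef struct {
    int key;
    int tag;
} Record;

/* Counting sort of records by key (keys in 0..k) from a into b.
 * reverse != 0: step 3 scans a from the last element, as in the slides.
 * reverse == 0: step 3 scans a from the first element, for comparison. */
void csort_records(int n, const Record *a, Record *b, int k, int reverse) {
    int *c = calloc((size_t)k + 1, sizeof *c);
    assert(c != NULL);
    for (int i = 0; i < n; i++)            /* step 1: counting */
        c[a[i].key]++;
    for (int v = 1; v <= k; v++)           /* step 2: accumulation */
        c[v] += c[v - 1];
    if (reverse) {
        for (int i = n - 1; i >= 0; i--)   /* step 3, last to first */
            b[--c[a[i].key]] = a[i];
    } else {
        for (int i = 0; i < n; i++)        /* step 3, first to last */
            b[--c[a[i].key]] = a[i];
    }
    free(c);
}
```

Each key's slots are filled from the highest index downward, so the element visited first lands last among its equals. Scanning from the end therefore keeps equal keys in input order, while scanning from the front still sorts by key but reverses the order of equal keys.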

Radix Sort
• Humans sort numbers digit by digit and words
letter by letter.
• Example: in 5354, the leftmost digit (5) is the most significant
digit (MSD) and the rightmost digit (4) is the least significant digit (LSD).
• Idea: more significant digits dominate less
significant digits.
• Don't forget that you still have to sort on every
digit to get the sorting job done.
• Question: sort on the MSD first or on the LSD first?

Imagine you have a set of number cards. If you sort by the more
significant digits first, you need to set aside the sorted piles and sort
each pile separately. This causes problems when you program in a similar fashion.

Radix Sort (2)
• LSD radix sort avoids this problem by sorting the cards on the least
significant digit first.
• The cards in bin 0 (the smaller digit) precede the cards in bin 1, and so on.
• The entire deck is then sorted again on the second-least significant digit.
• The process continues until the cards have been sorted on all D digits.
• Important: the digit sort must be stable.

Radix Sort: Example

Original   Digit 0   Digit 1   Digit 2
329        720       720       329
457        355       329       355
657        436       436       436
839        457       839       457
436        657       355       657
720        329       457       720
355        839       657       839

Radix Sort: Code
• For illustration ONLY, we code radix sort in base 10.
• In practice, we would use a group of bits as a digit, so that
digit extraction can be done with bitwise operators.

#define R 10    /* radix (base) */
#define D 3     /* number of digits */
...
void radixsort(int n, int *a){
    int i;

    for (i = 0; i < D; i++){
        csort(n, a, i);          /* stable sort on digit i (0 = least significant) */
        printf("i=%d\n", i);     /* show the array after each pass */
        print_array(n, a);
    }
}
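
The remark about using a group of bits as a digit can be sketched as follows. This is one possible version, not the course's code: it assumes unsigned 32-bit keys and 8-bit digits, and the names `radixsort_u32`, `RADIX_BITS`, `RADIX_SIZE` and `RADIX_MASK` are introduced here for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RADIX_BITS 8                     /* bits per digit (assumed choice) */
#define RADIX_SIZE (1u << RADIX_BITS)    /* 256 possible digit values */
#define RADIX_MASK (RADIX_SIZE - 1u)     /* 0xFF: keeps the low 8 bits */

/* LSD radix sort of 32-bit unsigned keys: one stable counting sort per byte.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int radixsort_u32(size_t n, uint32_t *a) {
    if (n == 0)
        return 0;
    assert(a != NULL);
    uint32_t *b = malloc(n * sizeof *b);
    if (b == NULL)
        return -1;
    for (unsigned shift = 0; shift < 32; shift += RADIX_BITS) {
        size_t c[RADIX_SIZE] = {0};
        for (size_t i = 0; i < n; i++)             /* step 1: counting */
            c[(a[i] >> shift) & RADIX_MASK]++;
        for (size_t v = 1; v < RADIX_SIZE; v++)    /* step 2: accumulation */
            c[v] += c[v - 1];
        for (size_t i = n; i-- > 0; )              /* step 3: last to first */
            b[--c[(a[i] >> shift) & RADIX_MASK]] = a[i];
        memcpy(a, b, n * sizeof *a);               /* copy the pass result back */
    }
    free(b);
    return 0;
}
```

The shift and mask replace the division and modulo of the base-10 version: `(a[i] >> shift) & RADIX_MASK` extracts one byte of the key. Signed keys would need extra handling that this sketch does not attempt.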

Radix Sort: Code (2)
• We modify the original counting sort so that a specified
digit serves as the key.
• Compared with the original csort: the key is the digit (a[i] / r) % R instead of
a[i], the counting array has R entries, and B is allocated, copied back to A and
freed inside the function.

void csort(int n, int *a, int d){
    int i, r = 1;
    int *b = (int *)malloc(n * sizeof(int));
    int *c = (int *)calloc(R, sizeof(int));

    for (i = 0; i < d; i++)
        r *= R;                             // calculate R^d
    for (i = 0; i < n; i++)                 // Step 1
        c[(a[i] / r) % R]++;                // counting
    for (i = 1; i < R; i++)                 // Step 2
        c[i] = c[i] + c[i - 1];
    for (i = n - 1; i >= 0; i--)            // Step 3
        b[--c[(a[i] / r) % R]] = a[i];
    for (i = 0; i < n; i++)
        a[i] = b[i];

    free(b); free(c);
}

Radix Sort: Code (2)
• We modify the original counting sort so that a specified
digit serves as the key.
• Sorting uses the d-th digit, d = 0, 1, 2.

void csort(int n, int *a, int d){
    int i, r = 1;
    int *b = (int *)malloc(n * sizeof(int));
    int *c = (int *)calloc(R, sizeof(int));

    for (i = 0; i < d; i++)
        r *= R;                             // calculate R^d
    for (i = 0; i < n; i++)                 // Step 1
        c[(a[i] / r) % R]++;                // counting
    for (i = 1; i < R; i++)                 // Step 2
        c[i] = c[i] + c[i - 1];
    for (i = n - 1; i >= 0; i--)            // Step 3
        b[--c[(a[i] / r) % R]] = a[i];
    for (i = 0; i < n; i++)
        a[i] = b[i];

    free(b); free(c);
}

Digit extraction for a[i] = 720 with R = 10:

d   r     (a[i] / r) % R
0   1     0
1   10    2
2   100   7

PROOF

Radix Sort: Correctness


Base case: after the first pass, the list L_1 is sorted on digit 1 (the least
significant digit).

Inductive step: suppose the list L_D is sorted on digits D, D - 1, ..., 1. We then
sort L_D on digit D + 1 using a stable sorting algorithm.

Consider two numbers a and b in L_D where a precedes b. By assumption, the number
formed by the last D digits of a is less than or equal to that of b.

Consider digit D + 1 of a and b, written a_{D+1} and b_{D+1}:

If a_{D+1} > b_{D+1}, then after this round a is placed after b.
If a_{D+1} = b_{D+1}, then after this round a still precedes b (by stability).
If a_{D+1} < b_{D+1}, then after this round a precedes b (by digit D + 1).

In every case the pair is correctly ordered on digits D + 1, D, ..., 1.
Therefore, L_{D+1} is sorted on digits D + 1, D, ..., 1.

By induction, radix sort is correct.

Radix Sort: Analysis
• Given N D-digit numbers in which each digit can take on up to
K possible values, radix sort correctly sorts these numbers in
O(D(N + K)) time.
• There are D counting sorts, each of which takes O(N + K) time.
• However, in general, it is hard to bound the numbers by their
number of digits.
• It is easier to express the bound in terms of bits, since computers
represent numbers in binary.

Given N B-bit numbers and any positive integer R ≤ B, radix sort
sorts them in O((B/R)(N + 2^R)) time.
Example: a 32-bit word has four 8-bit digits.
B = 32, R = 8, K = 2^R - 1 = 255 (the largest digit value), and D = B/R = 4.
11010110 00101001 00111000 11110101
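
To get a feel for the bound, the sketch below evaluates the expression (B/R)(N + 2^R) for given parameters. It is only an operation-count estimate that ignores constant factors; the function names `radix_passes` and `radix_cost` are hypothetical, and B/R is rounded up for the case where R does not divide B.

```c
#include <assert.h>

/* Number of counting-sort passes for B-bit keys split into R-bit digits. */
unsigned radix_passes(unsigned B, unsigned R) {
    assert(R >= 1 && R <= B);
    return (B + R - 1) / R;               /* ceil(B / R) */
}

/* Estimate passes * (N + 2^R); constant factors are ignored.
 * Assumes R < 64 so that 2^R fits in an unsigned long long. */
unsigned long long radix_cost(unsigned long long N, unsigned B, unsigned R) {
    assert(R < 64);
    return (unsigned long long)radix_passes(B, R) * (N + (1ULL << R));
}
```

With N = 1,000,000 keys, B = 32 and R = 8, this gives 4 passes and an estimate of 4 × (1,000,000 + 256) = 4,001,024. Choosing R = 32 would need only one pass, but the 2^R term then dominates.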
Summary
• Counting sort: sorts by counting the frequencies of the
keys. Stable, but does not sort in place. Θ(N + K)
• Radix sort: applies a stable sort digit by digit. Beginning
with the least significant digit requires little memory
overhead. Θ((B/R)(N + 2^R))
