0% found this document useful (0 votes)
48 views

Sorting in Linear Time

Counting sort, radix sort, and bucket sort are sorting algorithms that can sort in linear time under certain conditions on the input values. Counting sort works by counting the number of objects that have each distinct key value, and using arithmetic to determine the positions for objects in the output array. Radix sort iterates through each digit of multi-digit numbers to perform a series of counting sorts. Bucket sort works by distributing objects into "buckets" based on their hashed values and then sorting each bucket.

Uploaded by

Asim Ullah
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
48 views

Sorting in Linear Time

Counting sort, radix sort, and bucket sort are sorting algorithms that can sort in linear time under certain conditions on the input values. Counting sort works by counting the number of objects that have each distinct key value, and using arithmetic to determine the positions for objects in the output array. Radix sort iterates through each digit of multi-digit numbers to perform a series of counting sorts. Bucket sort works by distributing objects into "buckets" based on their hashed values and then sorting each bucket.

Uploaded by

Asim Ullah
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 43

Chapter 8-2: Sorting in Linear

Time

1
About this lecture
• Sorting algorithms we studied so far
– Insertion, Selection, Merge, Quicksort
 determine sorted order by comparison

• We will look at 3 new sorting algorithms


– Counting Sort, Radix Sort, Bucket Sort
 assume some properties on the input, and
determine the sorted order by distribution

2
Counting Sort

3
Counting Sort extra info
on values
• Input: Array A[1..n] of n integers,
each has value from [0,k]
• Output: Sorted array of the n integers
• Idea 1: Create B[1..n] to store the output
• Idea 2: Process A[1..n] from right to left
• Use k + 2 counters:
• One for “which element to process”
• k + 1 for “where to place”
4
Counting Sort (Details)
Before Running

A
2 1 2 5 3 3 1 2
k+1 counters

c[0], c[1], next element


c[2], c[3],
c[4], c[5]
B

5
Counting Sort (Details)
Step 1: Set c[j] = location in B for placing the
next element if it has value j
A
2 1 2 5 3 3 1 2

next element
c[2] = 5 c[5] = 8

c[0] = 0 c[1] = 2 c[3] = 7 c[4] = 7


6
Counting Sort (Details)
Step 2: Process next element of A and
update corresponding counter
A
2 1 2 5 3 3 1 2

next element
c[2] = 4 c[5] = 8

B
2

c[0] = 0 c[1] = 2 c[3] = 7 c[4] = 7


7
Counting Sort (Details)
Step 2: Process next element of A and
update corresponding counter
A
2 1 2 5 3 3 1 2

next element
c[2] = 4 c[5] = 8

B
1 2

c[0] = 0 c[1] = 1 c[3] = 7 c[4] = 7


8
Counting Sort (Details)
Step 2: Process next element of A and
update corresponding counter
A
2 1 2 5 3 3 1 2

next element
c[2] = 4 c[5] = 8

B
1 2 3

c[0] = 0 c[1] = 1 c[3] = 6 c[4] = 7


9
Counting Sort (Details)
Step 2: Process next element of A and
update corresponding counter
A
2 1 2 5 3 3 1 2

next element
c[2] = 4 c[5] = 8

B
1 2 3 3

c[0] = 0 c[1] = 1 c[3] = 5 c[4] = 7


10
Counting Sort (Details)
Step 2: Process next element of A and
update corresponding counter
A
2 1 2 5 3 3 1 2

next element
c[2] = 4 c[5] = 7

B
1 2 3 3 5

c[0] = 0 c[1] = 1 c[3] = 5 c[4] = 7


11
Counting Sort (Details)
Step 2: Process next element of A and
update corresponding counter
A
2 1 2 5 3 3 1 2

next element
c[2] = 3 c[5] = 7

B
1 2 2 3 3 5

c[0] = 0 c[1] = 1 c[3] = 5 c[4] = 7


12
Counting Sort (Details)
Step 2: Process next element of A and
update corresponding counter
A
2 1 2 5 3 3 1 2

next element
c[2] = 3 c[5] = 7

B
1 1 2 2 3 3 5

c[0] = 0 c[1] = 0 c[3] = 5 c[4] = 7


13
Counting Sort (Details)
Step 2: Done when all elements of A are processed

A
2 1 2 5 3 3 1 2

next element
c[2] = 3 c[5] = 7

B
1 1 2 2 2 3 3 5

c[0] = 0 c[1] = 0 c[3] = 5 c[4] = 7


14
Counting Sort (Step 1)
How can we perform Step 1 smartly?
1. Initialize c[0], c[1], …, c[k] to 0
2. /* First, set c[j] = # elements with value j */
For x = 1,2,…,n, increase c[A[x]] by 1
3. /* Set c[j] = location in B to place next element
whose value is j (iteratively) */
For y = 1,2,…,k, c[y] = c[y-1] + c[y]
Time for Step 1 = ( n + k )
15
Counting Sort (Step 2)
How can we perform Step 2 ?
/* Process A from right to left */
For x = n, n-1,…,2, 1
{ /* Process next element */
B[c[A[x]]] = A[x];
/* Update counter */
Decrease c[A[x]] by 1;
}
Time for Step 2 = ( n )
16
Counting Sort (Running Time)
Conclusion:
• Running time = ( n + k )
 if k = ( n ), time is (asymptotically) optimal
• Counting sort is also stable :
• elements with same value appear in same
order in before and after sorting

17
Stable Sort
Before
2 1 2 5 3 3 1 2
Sorting

After
Sorting 1 1 2 2 2 3 3 5

18
Radix Sort

19
Radix Sort extra info
on values
• Input: Array A[1..n] of n integers,
each has d digits, and
each digit has value from
[0,k]
• Output: Sorted array of the n integers
• Idea: Sort in d rounds
• At Round j, stable sort A on digit j
(where rightmost digit = digit 1)

20
Radix Sort (Example Run)
Before Running

1904
2579
1874
6355
4432
8318
1304

4 digits

21
Radix Sort (Example Run)
Round 1: Stable sort digit 1

1904 4432
2579 1904
1874 1874
6355 1304
4432 6355
831 8 831 8
1304 2579

22
Radix Sort (Example Run)
Round 2: Stable sort digit 2

4432 1903
1903 1304
1874 8318
1304 4432
6355 6355
831 8 1874
2579 2579

After Round 2, last 2 digits


are sorted (why?)
23
Radix Sort (Example Run)
Round 3: Stable sort digit 3

1903 1304
1304 8318
8318 6355
4432 4432
6355 2579
1874 1874
2579 1903

After Round 3, last 3 digits


are sorted (why?)
24
Radix Sort (Example Run)
Round 4: Stable sort digit 4

1304 1304
8318 1874
63 5 5 1903
4432 2579
2579 4432
1874 63 5 5
1903 8318

After Round 4, last 4 digits


are sorted (why?)
25
Radix Sort (Example Run)
Done when all digits are processed

1304
1874
1904
2579
4432
63 5 5
8318

The array is sorted (why?)

26
Radix Sort (Correctness)
Question:
“After r rounds, last r digits are sorted”
Why ??
Answer:
This can be proved by induction :
The statement is true for r = 1
Assume the statement is true for r = k
Then …
27
Radix Sort (Correctness)
At Round k+1,
• if two numbers differ in digit “k+1”, their
relative order [based on last k+1 digits] will be
correct after sorting digit “k+1”
• if two numbers match in digit “k+1”, their
relative order [based on last k+1 digits] will be
correct after stable sorting digit “k+1” (why?)

 Last “k+1” digits sorted after Round “k+1”

28
Radix Sort (Summary)
Conclusion:
• After d rounds, last d digits are sorted,
so that the numbers in A[1..n] are sorted
• There are d rounds of stable sort, each
can be done in ( n + k ) time
 Running time = ( d (n + k) )
• if d=(1) and k=(n), asymptotically optimal

29
Bucket Sort

30
Bucket Sort extra info
on values
• Input: Array A[1..n] of n elements,
each is drawn uniformly at
random from the
interval [0,1)
• Output: Sorted array of the n elements
• Idea:
Distribute elements into n buckets, so
that each bucket is likely to have fewer
elements  easier to sort
31
Bucket Sort (Details)
Before 0.78, 0.17, 0.39, 0.26, 0.72,
Running 0.94, 0.21, 0.12, 0.23, 0.68

Step 1:
Create n
buckets [0,0.1) [0.1,0.2) [0.2,0.3) [0.3,0.4) [0.4,0.5)

n = #buckets
= #elements
[0.5,0.6) [0.6,0.7) [0.7,0.8) [0.8,0.9) [0.9,1)

each bucket represents a


subinterval of size 1/n 32
Bucket Sort (Details)
Step 2: Distribute each element to correct bucket

0.17 0.26
0.21 0.39
0.12 0.23
[0,0.1) [0.1,0.2) [0.2,0.3) [0.3,0.4) [0.4,0.5)

0.78 0.94
0.68
0.72
[0.5,0.6) [0.6,0.7) [0.7,0.8) [0.8,0.9) [0.9,1)

If Bucket j represents subinterval [ j/n, (j+1)/n ),


element with value x should be in Bucket xn
33
Bucket Sort (Details)
Step 3: Sort each bucket (by insertion sort)

0.21
0.12 0.23
0.17 0.39
0.26
[0,0.1) [0.1,0.2) [0.2,0.3) [0.3,0.4) [0.4,0.5)

0.72 0.94
0.68 0.78
[0.5,0.6) [0.6,0.7) [0.7,0.8) [0.8,0.9) [0.9,1)

34
Bucket Sort (Details)
Step 4: Collect elements from Bucket 0 to Bucket n-1

0.21
0.12 0.23
0.17 0.39
0.26
[0,0.1) [0.1,0.2) [0.2,0.3) [0.3,0.4) [0.4,0.5)

0.72 0.94
0.68 0.78
[0.5,0.6) [0.6,0.7) [0.7,0.8) [0.8,0.9) [0.9,1)

Sorted 0.12, 0.17, 0.21, 0.23,


Output 0.26,
0.39, 0.68, 0.72, 0.78, 0.94 35
Bucket Sort (Running Time)
• Let X = # comparisons in all insertion sort
( n + X ) varies on input
Running time =
 worst-case running time = ( n2 )

• How about average running time?

Finding average of X (i.e. #comparisons)


gives average running time

36
Average Running Time
Let nj = # elements in Bucket j

X ≤ c( n02 + n12 + … + nn-12 )

varies on input
So, E[X] ≤ E[c(n02 + n12 + … + nn-12)]
= c E[n02 + n12 + … + nn-12]
= c (E[n02] + E[n12] + … + E[nn-12])
= cn E[n02] (by uniform
distribution) 37
Average Running Time
Textbook (pages 202-203) shows that
E[n02] = 2 – (1/n)
 E[X] ≤ cn E[n02] = 2cn – c
In other words, E[X] = ( n )
 Average running time = ( n )

38
For Interested Classmates
The following is how we can show
E[n02] = 2 – (1/n)

Recall that n0 = # elements in Bucket 0


So, suppose we set
Yk = 1 if element k is in Bucket 0
Yk = 0 if element k not in Bucket 0

Then, n0 = Y1 + Y2 + … + Yn
39
For Interested Classmates
Then,
E[n02] = E[(Y1 + Y2 + … + Yn)2]
= E[ Y12 + Y22 + … + Yn2
+ Y1Y2 + Y1Y3 + … + Y1Yn
+ Y2Y1 + Y2Y3 + … + Y2Yn
+…
+ YnY1 + YnY2 + … + YnYn-1 ]
40
= E[Y12] + E[Y22] + … + E[Yn2]
+ E[Y1Y2] + … + E[YnYn-1 ]
= n E[Y12] + n(n-1) E[Y1Y2]
(by symmetry)

The value of Y12 is either 1 (when Y1 = 1),


or 0 (when Y1 = 0)
The first case happens with 1/n chance
(when element 1 is in Bucket 0), so
E[Y12] = 1/n * 1 + (1- 1/n) * 0 = 1/n
41
For Y1Y2, it is either 1 (when Y1=1 and Y2=1),
or 0 (otherwise)
The first case happens with 1/n2 chance
(when both element 1 and element 2 are in
Bucket 0), so
E[Y1Y2] = 1/n2 * 1 + (1- 1/n2) * 0 = 1/n2

Thus, E[n02] = n E[Y12] + n(n-1) E[Y1Y2]


= n (1/n) + n(n-1) (1/n2)
= 2 – 1/n
42
Homework
• Exercise : 8.3-2
• Practice at home:
Exercises: 8.2-3, 8.2-4, 8.3-4, 8.4-2

43

You might also like