Data Structures (67109-2) - Recitation 4: 1 Linear Time Sorting
26-30.4.2024
The Hebrew University of Jerusalem
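Definition 1. Stable Sort
A sorting algorithm is stable if, whenever two elements a and b have equal keys and a appears before b in the input, the sort algorithm ensures a comes before b in the sorted list.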
1.1 Counting Sort
Counting Sort is a sorting technique for keys that lie within a specific range.
Input: n integer numbers in the range [0, . . . , k] where k = O (n) is an integer.
The idea: determine for each input element x its rank - the number of elements less than x. Once
we know the rank r of x, we can place it in position r + 1.
Algorithm 1 Counting Sort
Input: arr: unsorted array of integers ∈ {0} ∪ [k], k: largest number in the array.
Output: sorted_arr: sorted arr.
1 Function Counting-Sort(arr, k):
    /* allocate counting array and set to zero */
2     C = Allocate[0 . . . k]
3     for i = 0 to k do                        Θ(k)
4         C [i] = 0
5     for j = 0 to length(arr) − 1 do          Θ(n)
6         C [arr [j]] += 1 // now C [i] contains the number of elements equal to i
7     for i = 1 to k do                        Θ(k)
8         C [i] += C [i − 1] // now C [i] contains the number of elements ≤ i
9     sorted_arr = Allocate[0 . . . n − 1]
10    for j = length(arr) − 1 downto 0 do      Θ(n)
11        sorted_arr [C [arr [j]] − 1] = arr [j] // place element
12        C [arr [j]] −= 1 // reduce by 1
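The pseudocode of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the official course implementation; the names `counting_sort` and `count` are chosen here for readability, and the comments point back to the numbered lines of the algorithm.

```python
def counting_sort(arr, k):
    """Stable counting sort of integers in the range 0..k (inclusive)."""
    n = len(arr)
    count = [0] * (k + 1)            # lines 2-4: counting array, all zeros

    for value in arr:                # lines 5-6: count[v] = number of elements equal to v
        count[value] += 1

    for i in range(1, k + 1):        # lines 7-8: count[v] = number of elements <= v
        count[i] += count[i - 1]

    sorted_arr = [None] * n          # line 9
    for j in range(n - 1, -1, -1):   # lines 10-12: walk backwards so equal keys keep their order
        value = arr[j]
        sorted_arr[count[value] - 1] = value
        count[value] -= 1
    return sorted_arr


print(counting_sort([2, 0, 2, 6, 0, 1], 6))  # [0, 0, 1, 2, 2, 6]
```

Walking the input from the last element to the first in the final loop is what makes the sort stable: among equal keys, the last one in the input is placed in the last free slot for that key.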
Let’s show an example run of Counting Sort:
arr = 1 4 1 2 7 5 2 3 1 7 (indices 0 to 9)
[Figure: two panels, "Lines 5-6" and "Lines 7-11", showing the state of the run after those lines; both axes range over 0 to 9.]
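The run on arr = 1 4 1 2 7 5 2 3 1 7 can be reproduced with a short script. This is a sketch that assumes k is taken as the largest value in the array (7 here); the variable names are illustrative only.

```python
arr = [1, 4, 1, 2, 7, 5, 2, 3, 1, 7]
k = max(arr)  # assumption: k is the largest value in the input

count = [0] * (k + 1)
for value in arr:
    count[value] += 1
counts_after_lines_5_6 = list(count)   # count[v] = number of elements equal to v

for i in range(1, k + 1):
    count[i] += count[i - 1]
counts_after_lines_7_8 = list(count)   # count[v] = number of elements <= v

sorted_arr = [None] * len(arr)
for value in reversed(arr):            # lines 10-12
    sorted_arr[count[value] - 1] = value
    count[value] -= 1

print(counts_after_lines_5_6)  # [0, 3, 2, 1, 1, 1, 0, 2]
print(counts_after_lines_7_8)  # [0, 3, 5, 6, 7, 8, 8, 10]
print(sorted_arr)              # [1, 1, 1, 2, 2, 3, 4, 5, 7, 7]
```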
1.1.2 Loop Invariant
In the last few recitations and exercises you saw a few examples of proving the correctness of algorithms. This will be a key topic in your next course in Algorithms, but it is also important for our course. As you previously saw, we prove an algorithm's correctness by induction on the iterations of the algorithm (the iteration number plays the role of the induction variable i). To prove correctness, we sometimes use what is called a loop invariant. Knowing a loop's invariant is essential for understanding the effect of the loop and proving its correctness.
Definition 2. Loop Invariant
A loop invariant is a property of a program loop that is true before (and after) each iteration.
It is a logical assertion, sometimes checked within the code by an assertion call. There are
three parts of a loop invariant that we need to show:
1. Initialization: It is true prior to the first iteration of the loop.
2. Maintenance: If it is true before an iteration of the loop, it remains true before the
next iteration.
3. Termination: When the loop terminates, the invariant gives us a useful property that
helps show that the algorithm is correct.
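As a small illustration of the three parts, here is a sketch of a maximum-finding loop whose invariant is checked with `assert`. The function `find_max` is a hypothetical example written for this note, not one of the course algorithms, and the assertions recompute the maximum only to make the invariant visible.

```python
def find_max(values):
    """Return the maximum of a non-empty list, checking a loop invariant."""
    best = values[0]
    # Initialization: before the first iteration (i == 1), best == max(values[0:1]).
    for i in range(1, len(values)):
        # Invariant: before iteration i, best == max(values[0:i]).
        assert best == max(values[:i])
        if values[i] > best:
            best = values[i]
        # Maintenance: now best == max(values[0:i+1]), so the invariant
        # holds before the next iteration.
    # Termination: the invariant with i == len(values) gives best == max(values).
    assert best == max(values)
    return best


print(find_max([3, 9, 2, 7]))  # 9
```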
1.2 Radix Sort
Radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distribut-
ing elements into buckets according to their radix. For elements with more than one significant
digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior
step, until all digits have been considered.
Input: n integer numbers with (at most) d digits.
The idea: look at one digit at a time and sort the numbers according to this digit only.
Start from the least significant digit, working up to the most significant one. Since there are only
10 different digits, only 10 places are used for each column, so we can use counting sort for each
call, with k = 9, and since this is constant, we definitely have k = O (1) = O (n).
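The idea above can be sketched in Python. This is an illustrative version under stated assumptions: the inputs are non-negative decimal integers, and the helper name `counting_sort_by_digit` and the use of a place value `exp` (1, 10, 100, ...) to select a digit are choices made here, not taken from the recitation.

```python
def counting_sort_by_digit(arr, exp):
    """Stable counting sort of arr by the decimal digit at place value exp."""
    count = [0] * 10                       # k = 9, so the counting array has 10 entries
    for value in arr:
        count[(value // exp) % 10] += 1
    for d in range(1, 10):
        count[d] += count[d - 1]
    out = [None] * len(arr)
    for value in reversed(arr):            # backwards pass keeps the sort stable
        digit = (value // exp) % 10
        out[count[digit] - 1] = value
        count[digit] -= 1
    return out


def radix_sort(arr, d):
    """Sort non-negative integers that have at most d decimal digits."""
    exp = 1
    for _ in range(d):                     # least significant digit first
        arr = counting_sort_by_digit(arr, exp)
        exp *= 10
    return arr


print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
# [329, 355, 436, 457, 657, 720, 839]
```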
Algorithm 2 Radix Sort
[Listing omitted: the body of Algorithm 2 is missing from the source.]
Space complexity: S(n) = Θ(n). We can use one more array (Θ(n)) to copy the temporary results at each iteration and pass it as the target sorting array to the counting sort in the following iteration (line 3). In addition, if k = 10, then the counting array C can be allocated once and reused in each iteration, so it takes only constant space, O(1). Therefore, the overall space complexity of the algorithm is Θ(n).
Remark 1. This algorithm MUST use a stable sort (def. 1), otherwise it fails.
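To see why stability matters, the following sketch runs the same digit-by-digit procedure twice: once with a stable per-digit sort and once with a deliberately unstable one. The per-digit sort here distributes values into ten lists rather than using counting sort, and the `stable` flag is a hypothetical switch introduced only for this demonstration.

```python
def sort_by_digit(arr, exp, stable):
    """Sort arr by the decimal digit at place value exp using ten buckets."""
    buckets = [[] for _ in range(10)]
    for value in arr:
        buckets[(value // exp) % 10].append(value)
    out = []
    for bucket in buckets:
        # Reversing a bucket flips the order of elements with equal digits,
        # which makes the per-digit sort unstable on purpose.
        out.extend(bucket if stable else reversed(bucket))
    return out


def radix_sort(arr, d, stable=True):
    exp = 1
    for _ in range(d):
        arr = sort_by_digit(arr, exp, stable)
        exp *= 10
    return arr


print(radix_sort([12, 11], 2, stable=True))   # [11, 12]
print(radix_sort([12, 11], 2, stable=False))  # [12, 11]
```

In the unstable run, the first pass orders the values correctly by their last digit, but the second pass, where both tens digits are equal, undoes that order.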
Loop Invariant: In the outer for-loop, just before the ith iteration, the integers in array arr are sorted according to their x_{i−1} values, where x_i denotes the number formed by digits 0 through i of x (digit 0 being the least significant).
Initialization: Before the iteration with i = 0, no digit has been considered yet, so the invariant imposes no ordering and holds trivially.
Maintenance: Assume that the loop invariant holds before the ith iteration; we aim to prove that it also holds before iteration i + 1. To do this, we must establish that for any pair of integers x, y in arr, their relative order after the ith iteration is determined by the values of x_i and y_i. WLOG, assume x_i < y_i; we need to show that x will appear before y. There are two cases (the ith digit of x cannot be larger than the ith digit of y, since that would give x_i > y_i):
1. The ith digit of x < the ith digit of y: Radix-Sort will put x before y in the current round of sorting.
2. The ith digit of x = the ith digit of y: since x_i < y_i, we have x_{i−1} < y_{i−1}, so by the loop invariant x appears before y before the ith iteration. Since the ith digits are the same and the sort is stable, their order will not change in the new iteration, so they will remain in the same order.
Termination: When i = d, the invariant states that the array is sorted according to x_{d−1}, which is x itself, so the array is sorted. You will prove by invariant that the algorithm is stable in ex4.
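The invariant can also be checked mechanically. The sketch below asserts it before every iteration, reading "sorted by digits 0 through i − 1" as "sorted by x mod 10^i", which is an interpretation assumed here. Python's built-in `sorted`, which is stable, stands in for counting sort on the ith digit.

```python
def radix_sort_checked(arr, d):
    """Radix sort on non-negative integers with at most d digits, asserting the loop invariant."""
    arr = list(arr)
    for i in range(d):
        # Invariant before iteration i: arr is sorted by x mod 10**i (digits 0..i-1).
        keys = [x % 10**i for x in arr]
        assert keys == sorted(keys)
        # sorted() is stable; it is a stand-in for counting sort on digit i.
        arr = sorted(arr, key=lambda x: (x // 10**i) % 10)
    # Termination (i == d): sorted by x mod 10**d, which is x itself.
    assert arr == sorted(arr)
    return arr


print(radix_sort_checked([170, 45, 75, 90, 802, 24, 2, 66], 3))
# [2, 24, 45, 66, 75, 90, 170, 802]
```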
1.3 Bucket Sort
Bucket sort is a sorting algorithm that works by distributing the elements of an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by recursively applying the bucket sorting algorithm.
Input: n real numbers in the interval [0, 1), uniformly distributed (this will be explained further next week, but for now we may think of it as every number in the interval being equally likely to appear).
The idea: Divide the interval [0, 1) into n buckets starting at 0, 1/n, 2/n, . . . , (n − 1)/n. Put each element a into its matching bucket i, such that i/n ≤ a < (i + 1)/n. Since the numbers are uniformly distributed, not too many elements will be placed in each bucket. If we insert them in order (using Insertion-Sort), the buckets and the elements in them will always be in sorted order.
Algorithm 3 Bucket Sort
Input: arr: unsorted array of n real numbers in the interval [0, 1).
Output: Sorted arr.
1 Function Bucket-Sort(arr):
2     n = length(arr)
3     B ← a list of n lists // B [i] are the buckets
4     for i = 0 to n − 1 do
5         insert arr [i] into list B [⌊n · arr [i]⌋] // arr [i] ∈ [0, 1), ⌊n · arr [i]⌋/n ≤ arr [i] < (⌊n · arr [i]⌋ + 1)/n
6     for i = 0 to n − 1 do
7         Insertion-Sort(B [i])
8     concatenate lists B [0] , . . . , B [n − 1] in order
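Example run (non-empty buckets only):
arr: 72 17 39 26 78 94 21 12 23 68
Buckets after lines 4-5: [12 17] [23 21 26] [39] [68] [78 72] [94]
Buckets after lines 6-7: [12 17] [21 23 26] [39] [68] [72 78] [94]
Result after line 8: 12 17 21 23 26 39 68 72 78 94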
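Algorithm 3 can be sketched in Python as follows. This is a minimal illustration: the example values are the ones above rescaled into [0, 1) (an assumption about how the figure's two-digit numbers relate to the algorithm's input), and the `min(..., n - 1)` guard is an addition here against floating-point rounding for values extremely close to 1.

```python
def insertion_sort(items):
    """Sort a list in place with insertion sort."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key


def bucket_sort(arr):
    """Sort real numbers in [0, 1)."""
    n = len(arr)
    buckets = [[] for _ in range(n)]            # line 3
    for x in arr:                               # lines 4-5
        index = min(int(n * x), n - 1)          # floor(n * x), clamped to 0..n-1
        buckets[index].append(x)
    result = []
    for bucket in buckets:                      # lines 6-8
        insertion_sort(bucket)
        result.extend(bucket)
    return result


data = [0.72, 0.17, 0.39, 0.26, 0.78, 0.94, 0.21, 0.12, 0.23, 0.68]
print(bucket_sort(data))
# [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
```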
Remark 3. Sorting buckets (line 7) can be done with any sorting algorithm. Insertion sort is effi-
cient when dealing with small inputs, which is why we prefer it.
Remark 4. We can use Bucket Sort for n real numbers in any interval [a, b). To do this, we need
to normalize the input, use bucket sort, and then unnormalize the output.
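One way to handle an interval [a, b) is sketched below. It differs slightly from the normalize, sort, unnormalize recipe of the remark: the normalized value is used only to choose the bucket, while the original value is what gets stored, so no unnormalize step (and no round-off from it) is needed. The function name `bucket_sort_range` is hypothetical, and Python's `sorted` is used inside each bucket in place of insertion sort.

```python
def bucket_sort_range(arr, a, b):
    """Sort real numbers in [a, b) by mapping each one to a bucket index in 0..n-1."""
    n = len(arr)
    buckets = [[] for _ in range(n)]
    for x in arr:
        t = (x - a) / (b - a)                   # normalize x to [0, 1)
        index = min(int(n * t), n - 1)          # clamp against floating-point rounding
        buckets[index].append(x)                # store the original value
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))           # any sorting algorithm works per bucket
    return result


print(bucket_sort_range([5.5, 2.0, 9.25, 3.75, 2.5], 2, 10))
# [2.0, 2.5, 3.75, 5.5, 9.25]
```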
Time complexity: The initialization and the first loop (line 4) take O(n).
Worst case: In the worst case, all n numbers end up in the same bucket and the algorithm takes O(n²) (line 7).
Average case: In the average case, only a constant number of elements fall in each bucket, at least with very high probability, so insertion sort takes O(1) on each bucket, and therefore the second loop also takes O(n).
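The gap between the two cases comes down to how many elements share a bucket. The sketch below compares a well-spread input with a clustered one; both inputs are constructed by hand for illustration (n = 8 is chosen so the floating-point arithmetic is exact), and `max_bucket_size` is a hypothetical helper.

```python
def max_bucket_size(arr):
    """Return the size of the fullest bucket when n values in [0, 1) go into n buckets."""
    n = len(arr)
    sizes = [0] * n
    for x in arr:
        sizes[min(int(n * x), n - 1)] += 1
    return max(sizes)


n = 8
spread = [(i + 0.5) / n for i in range(n)]      # one value per bucket
clustered = [i / (n * n) for i in range(n)]     # every value lies in [0, 1/n)

print(max_bucket_size(spread))     # 1
print(max_bucket_size(clustered))  # 8
```

With the spread input every insertion sort call handles a single element, while the clustered input forces one insertion sort over all n elements, which is where the O(n²) worst case comes from.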
Space complexity: S(n) = Θ(n). The space complexity for allocating B (line 3) is Θ(n).
So to summarize this recitation, with some additional assumptions, we can sort elements in optimal time and space Θ(n).
Example. Find a loop invariant for the following loop:
1 j = 9
2 for i = 0 to 10 do
3     j −= 1
Solution. i + j == 9 holds before every iteration.
Algorithm 4 Quick Sort