
Data Structures and Algorithms 2

Prof. Ahmed Guessoum


The National Higher School of AI

Chapter 7
Sorting
Preliminaries
• We discuss the problem of sorting an array of elements.
• We will assume that the array contains only integers, although the
code will allow more general objects.
• We will assume that the entire sort can be done in main memory, i.e., that
the number of elements is relatively small (less than a few million).
• External sorting (which must be done on disk or tape) will be
discussed at the end of the chapter.
• We will assume the existence of the “<” and “>” operators (besides
the assignment operator). They can be used to place a consistent
ordering on the input. Sorting under these conditions is known as
comparison-based sorting.
Sorting in the STL
• The interface that will be used is not the same as in the STL sorting algorithms.
• In STL, sorting (generally quicksort) is accomplished by use of the function
template sort.
void sort( Iterator begin, Iterator end );
void sort( Iterator begin, Iterator end, Comparator cmp );
• The iterators must support random access.
• The sort algorithm does not guarantee that equal items retain their original
order (if that is important, use stable_sort instead of sort).
std::sort( v.begin( ), v.end( ) );
// sort the entire container, v, in nondecreasing order
std::sort( v.begin( ), v.end( ), greater<int>{ } );
// sort container v in nonincreasing order
std::sort( v.begin( ), v.begin( ) + ( v.end( ) - v.begin( ) ) / 2 );
// sort first half of container v in nondecreasing order
Insertion Sort
• It is one of the simplest sorting algorithms.
• Insertion sort consists of N−1 passes.
• For pass p = 1 through N−1, insertion sort ensures that the
elements in positions 0 through p are in sorted order.
• It makes use of the fact that elements in positions 0 through
p−1 are already known to be in sorted order.
• In pass p , we move the element in position p left until its
correct place is found among the first p +1 elements.
• The element in position p is moved to tmp, and all larger
elements (prior to position p) are moved one spot to the right.
Then tmp is moved to the correct spot at the end.
// Simple insertion sort.
template <typename Comparable>
void insertionSort( vector<Comparable> & a )
{
for( int p = 1; p < a.size( ); ++p ) {
Comparable tmp = std::move( a[ p ] );
int j;
for( j = p; j > 0 && tmp < a[ j - 1 ]; --j )
a[ j ] = std::move( a[ j - 1 ] );
a[ j ] = std::move( tmp );
}
}
Analysis of Insertion Sort
• Because of the nested loops, each of which can take N iterations, insertion
sort is O(N²).
• A precise calculation shows that the number of tests in the inner loop is at
most p + 1 for each value of p. Summing over all p gives a total of
2 + 3 + 4 + · · · + N = Θ(N²).
• If the input is pre-sorted (or almost sorted), the running time is O(N),
because the test in the inner for loop always fails immediately.
• An inversion in an array of numbers is any ordered pair (i, j) having the
property that i < j but a[i] > a[j].
• In the previous example: the input list 34, 8, 64, 51, 32, 21 had nine
inversions, namely (34, 8), (34, 32), (34, 21), (64, 51), (64, 32), (64, 21),
(51, 32), (51, 21), and (32, 21).
• This is exactly the number of swaps that needed to be (implicitly)
performed by insertion sort.
• This is always the case, because swapping two adjacent elements that are
out of place removes exactly one inversion, and a sorted array has no
inversions.
• Since there is O(N) other work involved in the algorithm, the running time
of insertion sort is O(I + N), where I is the number of inversions in the
original array.
• Thus, insertion sort runs in linear time if the number of inversions is O(N).
• We can compute precise bounds on the average running time of insertion
sort by computing the average number of inversions in a permutation.
• We will assume that there are no duplicate elements
• Using this assumption, we can assume that the input is some permutation
of the first N integers
• Under these assumptions, we have the following theorem:
Theorem 7.1: The average number of inversions in an array of N distinct
elements is N (N − 1)/4. (Proof: See textbook.)

• This theorem implies that insertion sort is quadratic on average. It also
provides a very strong lower bound on any algorithm that only exchanges
adjacent elements.
Theorem 7.2: Any algorithm that sorts by exchanging adjacent elements
requires Ω(N²) time on average. (Proof: See textbook.)
Shellsort
• Shellsort (Donald Shell) was one of the first algorithms to break the quadratic
time barrier.
• It works by comparing elements that are distant; the distance between
comparisons decreases as the algorithm runs until the last phase, in which
adjacent elements are compared.
• For this reason, Shellsort is sometimes referred to as diminishing increment
sort.
• Shellsort uses a sequence, h1, h2, . . . , ht, called the increment sequence.
• Any increment sequence will do as long as h1 = 1, but some choices are
better than others.
• After a phase, using some increment hk, for every i we have a[i] ≤ a[i + hk]:
all elements spaced hk apart are sorted.
• Important property of Shellsort: if you hk−1-sort an hk-sorted file, then it
remains hk-sorted.
• The general strategy to hk-sort is: for each position i, insertion-sort the
subsequence of elements spaced hk apart; then take the next (smaller)
increment, and so on.
• Suppose the increment sequence is 5, 3, 1

• A popular (but poor) choice for increment sequence is to use the sequence
suggested by Shell: ht = floor(N / 2), and hk = floor(hk+1 / 2).

// Shellsort routine using Shell’s increments (better increments are possible)
template <typename Comparable>
void shellsort( vector<Comparable> & a )
{
for( int gap = a.size( ) / 2; gap > 0; gap /= 2 )
for( int i = gap; i < a.size( ); ++i )
{
Comparable tmp = std::move( a[ i ] );
int j = i;
for( ; j >= gap && tmp < a[ j - gap ]; j -= gap )
a[ j ] = std::move( a[ j - gap ] );
a[ j ] = std::move( tmp );
}
}
Worst-Case Analysis of Shellsort
• The running time of Shellsort depends on the choice of increment sequence, and the proofs
can be rather involved.
• The average-case analysis of Shellsort is a long-standing open problem, except for the most
trivial increment sequences.
Theorem 7.3: The worst-case running time of Shellsort using Shell’s increments is Θ(N²).
• Hibbard suggested a slightly different increment sequence, which gives better results in
practice (and theoretically): 1, 3, 7, ..., 2^k − 1.
• Key difference: consecutive increments have no common factors.
• For this increment sequence, we have the following theorem:
Theorem 7.4: The worst-case running time of Shellsort using Hibbard’s increments is Θ(N^(3/2)).

• The performance of Shellsort is quite acceptable in practice, even for N in the tens of
thousands. The simplicity of the code makes it the algorithm of choice for sorting up to
moderately large input.
Heapsort
• Priority queues can be used to sort in O(N log N) time. The algorithm
based on this idea is known as heapsort.
• Reminder from Chapter 6: the basic strategy is
 build a binary heap of N elements. This stage takes O(N) time.
 then perform N deleteMin operations.
• The elements leave the heap smallest first, in sorted order.
• By recording these elements in a second array and then copying the
array back, we sort N elements.
• Since each deleteMin takes O(log N) time, the total running time of
heapsort is O(N log N).
• The main problem with this algorithm is that it uses an extra array.
Thus, the memory requirement is doubled.
Alternative to doubling the array
• Make use of the fact that after each deleteMin, the heap shrinks by 1.
• Thus, the cell that was last in the heap can be used to store the element that
was just deleted.
• Example, suppose we have a heap with six elements.
o The first deleteMin produces a 1.
o Now the heap has only five elements, so we can place a 1 in position 6.
o The next deleteMin produces a 2. The heap will now only have four
elements, so we can place a 2 in position 5.
o And so on
• After the last deleteMin the array will contain the elements in decreasing
sorted order.
• If we want the elements in the more typical increasing sorted order, we can
change the ordering property so that the parent has a larger element than the
child. Thus, we have a (max)heap.
Build a heap from the list 58, 41, 59, 26, 53, 97, 31

(Max) heap after buildHeap phase

Heap after first deleteMax

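To make the idea concrete, here is a minimal in-place heapsort sketch in the spirit of the
routine just described (0-based array, max-heap, with percDown used both to build the heap
and to restore it after each deleteMax); names and details are illustrative:

// In-place heapsort sketch: buildHeap, then repeated deleteMax into the freed slot.
#include <utility>
#include <vector>
using std::vector;

inline int leftChild( int i )
{ return 2 * i + 1; }

template <typename Comparable>
void percDown( vector<Comparable> & a, int i, int n )
{
    int child;
    Comparable tmp = std::move( a[ i ] );
    for( ; leftChild( i ) < n; i = child )
    {
        child = leftChild( i );
        if( child != n - 1 && a[ child ] < a[ child + 1 ] )
            ++child;                              // pick the larger child
        if( tmp < a[ child ] )
            a[ i ] = std::move( a[ child ] );
        else
            break;
    }
    a[ i ] = std::move( tmp );
}

template <typename Comparable>
void heapsort( vector<Comparable> & a )
{
    int n = a.size( );
    for( int i = n / 2 - 1; i >= 0; --i )         // buildHeap phase
        percDown( a, i, n );
    for( int j = n - 1; j > 0; --j )
    {
        std::swap( a[ 0 ], a[ j ] );              // deleteMax: move the max into the freed slot
        percDown( a, 0, j );                      // restore the (max)heap on the shrunken prefix
    }
}

After the last pass the array holds the elements in increasing sorted order, as described above.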
Mergesort
• The fundamental operation in Mergesort is merging two sorted lists.
• Since the lists are sorted, this can be done in one pass through the input
if the output is put in a third list.
• Basic merging algorithm:
• Input arrays A and B, and output array C,
• Use three counters, Actr, Bctr, and Cctr, initially set to the beginning
of their respective arrays.
• The smaller of A[Actr] and B[Bctr] is copied to the next entry in C,
and the appropriate counters are advanced.
• When either input list is exhausted, the remainder of the other list is
copied to C.
• The merge operation clearly takes linear, O(N), time.
Merge the following two arrays

The remainder of the B array is then copied to C
Mergesort algorithm
• If N = 1, there is only one element to sort, and the answer is the element
itself.
• Otherwise, recursively mergeSort the first half and the second half.
• This gives two sorted halves, which can then be merged together using
the merging algorithm described above.
• For instance, to sort the eight-element array 24, 13, 26, 1, 2, 27, 38, 15
• Recursively sort the first four and last four elements, obtaining
1, 13, 24, 26, 2, 15, 27, 38.
• Then merge the two halves as above, obtaining the final list
1, 2, 13, 15, 24, 26, 27, 38.
• This algorithm is a classic divide-and-conquer strategy.
• The problem is divided into smaller problems and solved recursively.
• The conquering phase consists of patching together the answers.
• Divide-and-conquer is a very powerful use of recursion that we will see
many times.
Mergesort routines
Merge routine
• The merge routine is subtle. If a temporary array is declared locally for
each recursive call of merge, then there could be log N temporary arrays
active at any point.
• A close examination shows that since merge is the last line of mergeSort,
there only needs to be one temporary array active at any point, and that
the temporary array can be created in the public mergeSort driver.
• Further, we can use any part of the temporary array; we will use the
same portion as the input array a.
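A hedged sketch of the driver, the recursive routine, and the merge step just described
(one temporary array shared by all calls, created in the public driver); it follows the
structure described above, though the textbook's exact code may differ in details:

// Mergesort sketch with a single shared temporary array.
#include <utility>
#include <vector>
using std::vector;

template <typename Comparable>
void merge( vector<Comparable> & a, vector<Comparable> & tmpArray,
            int leftPos, int rightPos, int rightEnd )
{
    int leftEnd = rightPos - 1;
    int tmpPos = leftPos;
    int numElements = rightEnd - leftPos + 1;

    while( leftPos <= leftEnd && rightPos <= rightEnd )      // main merging loop
        if( a[ leftPos ] <= a[ rightPos ] )
            tmpArray[ tmpPos++ ] = std::move( a[ leftPos++ ] );
        else
            tmpArray[ tmpPos++ ] = std::move( a[ rightPos++ ] );
    while( leftPos <= leftEnd )                              // copy rest of first half
        tmpArray[ tmpPos++ ] = std::move( a[ leftPos++ ] );
    while( rightPos <= rightEnd )                            // copy rest of second half
        tmpArray[ tmpPos++ ] = std::move( a[ rightPos++ ] );

    for( int i = 0; i < numElements; ++i, --rightEnd )       // copy tmpArray back into a
        a[ rightEnd ] = std::move( tmpArray[ rightEnd ] );
}

template <typename Comparable>
void mergeSort( vector<Comparable> & a, vector<Comparable> & tmpArray,
                int left, int right )
{
    if( left < right )
    {
        int center = ( left + right ) / 2;
        mergeSort( a, tmpArray, left, center );              // sort first half
        mergeSort( a, tmpArray, center + 1, right );         // sort second half
        merge( a, tmpArray, left, center + 1, right );       // merge the two sorted halves
    }
}

template <typename Comparable>
void mergeSort( vector<Comparable> & a )                     // public driver
{
    vector<Comparable> tmpArray( a.size( ) );                // the one temporary array
    mergeSort( a, tmpArray, 0, static_cast<int>( a.size( ) ) - 1 );
}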
Analysis of Mergesort
• Mergesort is a classic example of the techniques used to analyze recursive
routines: We have to write a recurrence relation for the running time.
• We will assume that N is a power of 2 so that we always split into even halves.
• For N = 1, the time to mergesort is constant, which we will denote by 1.
• Otherwise, the time to mergesort N numbers is equal to the time to do two
recursive mergesorts of size N/2, plus the time to merge, which is linear. So
T(1) = 1
T(N) = 2T(N/2) + N
Since we can substitute N/2 into the main equation,
2T(N/2) = 2(2(T(N/4)) + N/2) = 4T(N/4) + N

We have
T(N) = 4T(N/4) + 2N
Again, by substituting N/4 into the main equation, we see that
4T(N/4) = 4(2T(N/8) + N/4) = 8T(N/8) + N
So we have
T(N) = 8T(N/8) + 3N
Continuing in this manner, we obtain
T(N) = 2^k T(N/2^k) + k · N
Using k = log N, we obtain
T(N) = N T(1) + N log N = N log N + N

Remarks on MergeSort
• Though we assumed N = 2^k, the analysis can be refined to handle cases when N is not
a power of 2. (Answer almost identical).
• Although mergeSort’s running time is O(N logN), it has the significant problem that
• merging two sorted lists uses linear extra memory; and
• the additional work involved in copying to the temporary array and back,
throughout the algorithm, slows the sort considerably.
• This copying can be avoided by judiciously switching the roles of a and tmpArray at
alternate levels of the recursion.
• A non-recursive implementation of mergeSort is also possible.
• The running time of mergeSort, when compared with other O(N logN) alternatives,
depends heavily on the relative costs of comparing elements and moving elements in
the array (and the temporary array).
• These costs are language dependent. (See the discussion Java vs C++ in textbook.)
QuickSort
• For C++, quicksort has historically been the fastest known
generic sorting algorithm in practice.
• Its average running time is O(N logN).
• It has O(N²) worst-case performance.
• By combining quicksort with heapsort, we can achieve
quicksort’s fast running time on almost all inputs, with
heapsort’s O(N logN) worst-case running time. (Left as Exercise
7.27, which describes this approach).
• Like mergeSort, quickSort is a divide-and-conquer recursive
algorithm.
QuickSort Algorithm
• Let us begin with the following simple sorting algorithm to sort a list.
• Given a list of items to sort
• Arbitrarily choose any item
• Form three groups: those smaller than the chosen item, those equal to the
chosen item, and those larger than the chosen item.
• Recursively sort the first and third groups
• Concatenate the three groups.
Implementation of this simple recursive sorting algorithm (see the sketch below)
• This algorithm forms the basis of quicksort. But it does not change much from
mergeSort, especially in terms of extra memory.
• quicksort is commonly written in a way that avoids creating the second group
(the equal items), and the algorithm has numerous subtle details that affect the
performance; therein lie the complications.
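A hedged sketch of this simple three-group version (illustrative only: unlike the
in-place partitioning developed next, it uses extra memory for the three groups):

// Simple three-group quicksort sketch: smaller / same / larger, then concatenate.
#include <iterator>
#include <utility>
#include <vector>
using std::vector;

template <typename Comparable>
void simpleQuicksort( vector<Comparable> & items )
{
    if( items.size( ) > 1 )
    {
        vector<Comparable> smaller, same, larger;
        Comparable chosenItem = items[ items.size( ) / 2 ];   // arbitrarily chosen item (a copy)

        for( auto & i : items )                               // form the three groups
        {
            if( i < chosenItem )
                smaller.push_back( std::move( i ) );
            else if( chosenItem < i )
                larger.push_back( std::move( i ) );
            else
                same.push_back( std::move( i ) );
        }

        simpleQuicksort( smaller );                           // recursively sort first group
        simpleQuicksort( larger );                            // recursively sort third group

        items.clear( );                                       // concatenate the three groups
        items.insert( items.end( ), std::make_move_iterator( smaller.begin( ) ),
                                    std::make_move_iterator( smaller.end( ) ) );
        items.insert( items.end( ), std::make_move_iterator( same.begin( ) ),
                                    std::make_move_iterator( same.end( ) ) );
        items.insert( items.end( ), std::make_move_iterator( larger.begin( ) ),
                                    std::make_move_iterator( larger.end( ) ) );
    }
}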
“Classic quicksort”
• The following is the most common implementation of quicksort.
• Only one array S is used; it is the input array.
Four steps:
1. If the number of elements in S is 0 or 1, then return.
2. Pick any element v in S. This is called the pivot.
3. Partition S − {v} (the remaining elements in S) into two disjoint
groups: S1 = {x ∈ S − {v} | x ≤ v}, and S2 = {x ∈ S − {v} | x ≥ v}.
4. Return {quicksort(S1) followed by v followed by quicksort(S2)}.
• The devil lies in the detail (of parts 2 and 3 of the algorithm)!

Example to sort a list of numbers

The pivot is selected randomly to be 65
Discussion
• The previous algorithm works, but is it any faster than mergesort?
• Like mergesort, it recursively solves two subproblems and requires linear
additional work.
• But, unlike mergesort, the subproblems are not guaranteed to be of equal
size, which is potentially bad.
• quicksort is faster because the partitioning step can actually be performed
in place and very efficiently.
• This efficiency more than makes up for the lack of equal-sized recursive
calls.

Picking the Pivot
• The algorithm as described works whichever element is chosen as pivot;
some choices are obviously better than others.
• Choosing the first element as pivot:
• Acceptable if the input is random;
• If the input is pre-sorted (or has a large pre-sorted section) or in reverse
order, then the pivot provides a poor partition, because either all the
elements go into S1 or they go into S2.
• If the input is pre-sorted, then quicksort will take quadratic time to do
essentially nothing at all.
• Pre-sorted input is quite frequent, so using the first element as pivot is a
very bad idea.
• A safe course is to choose the pivot randomly, unless the random number
generator has a flaw.
Median-of-three Partitioning
• The median of a group of N numbers is the ceiling(N / 2)th largest number.
• The best choice of pivot would be the median of the array.
• This is hard to calculate and would slow down quicksort considerably.
• A good estimate can be obtained by picking three elements randomly and
using the median of these three as pivot.
• The randomness turns out not to help much, so the common course is to use
as pivot the median of the left, right, and centre elements.
• Example: for the input 8, 1, 4, 9, 6, 3, 5, 2, 7, 0
leftElt is 8; rightElt is 0; centerElt is in position floor((left + right) / 2), i.e., it is 6
• Using median-of-three partitioning reduces the number of comparisons by
14%.
Partitioning Strategy
• Several partitioning strategies are used in practice; the one described here is known
to give good results.
• The first step is to get the pivot element out of the way by swapping it with the last
element.
• i starts at the first element and j starts at the next-to-last element.
• The partitioning stage wants to move all the small elements to the left part of the
array and all the large elements to the right part. (“Small” and “large” are relative to
the pivot.)
• While i is to the left of j, it is moved right, skipping over elements that are smaller
than the pivot.
• j is moved left, skipping over elements that are larger than the pivot.
• When i and j have stopped, i is pointing at a large element and j is pointing at a
small element.
• If i is to the left of j, those elements are swapped.
Swap the elements pointed to by i and j. Repeat the process until i and j cross.

Now, i and j have crossed, so no swap is performed.
Partitioning final part: swap the pivot element with the element pointed to by i:

Handling equal elements
• One important detail we must consider is how to handle elements that are
equal to the pivot: should i and j stop when they encounter one?
• Consider the case where all the elements in the array are identical.
• If both i and j stop, there will be many swaps between identical elements.
• Although this seems useless, the positive effect is that i and j will cross in
the middle, so when the pivot is replaced, the partition creates two nearly
equal subarrays.
• The mergesort analysis tells us that the total running time would then be
O(N logN).

Handling equal elements
• If neither i nor j stops, and code is present to prevent them from
running off the end of the array, no swaps will be performed.
• Although this seems good, a correct implementation would then
swap the pivot into the last spot that i touched, which would be
the next-to-last position (or last, depending on the exact
implementation).
 This would create very uneven subarrays.
 If all the elements are identical, the running time is O(N²).
• It is better to do the unnecessary swaps and create even subarrays
than to risk wildly uneven subarrays.
• Therefore, we will have both i and j stop if they encounter an
element equal to the pivot.
Handling Small Arrays
• For very small arrays (N ≤ 20), quicksort does not perform as well as
insertionSort.
• Also, because quicksort is recursive, these cases will occur frequently.
• A common solution is not to use quicksort recursively for small arrays, but
instead use a sorting algorithm that is efficient for small arrays, such as
insertionSort.
• Using this strategy can actually save about 15 percent in the running time
(over doing no cutoff at all).
• A good cutoff range is N = 10, although any cutoff between 5 and 20 is
likely to produce similar results.
Quicksort routines
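A hedged sketch combining median-of-three partitioning, the partitioning loop, and an
insertion-sort cutoff, as described in the preceding slides. The ranged helper
insertionSort( a, left, right ) is assumed to exist (the insertionSort shown earlier sorts
a whole vector); details may differ from the textbook's exact routine:

// Classic quicksort sketch: median-of-three pivot, in-place partition, cutoff of 10.
#include <utility>
#include <vector>
using std::vector;

template <typename Comparable>
const Comparable & median3( vector<Comparable> & a, int left, int right )
{
    int center = ( left + right ) / 2;
    if( a[ center ] < a[ left ] )   std::swap( a[ left ], a[ center ] );
    if( a[ right ] < a[ left ] )    std::swap( a[ left ], a[ right ] );
    if( a[ right ] < a[ center ] )  std::swap( a[ center ], a[ right ] );
    std::swap( a[ center ], a[ right - 1 ] );        // hide the pivot at position right - 1
    return a[ right - 1 ];
}

template <typename Comparable>
void quicksort( vector<Comparable> & a, int left, int right )
{
    if( left + 10 <= right )                         // cutoff: recurse only on larger subarrays
    {
        const Comparable & pivot = median3( a, left, right );

        int i = left, j = right - 1;                 // begin partitioning
        for( ; ; )
        {
            while( a[ ++i ] < pivot ) { }            // i skips over elements smaller than the pivot
            while( pivot < a[ --j ] ) { }            // j skips over elements larger than the pivot
            if( i < j )
                std::swap( a[ i ], a[ j ] );         // swap the out-of-place pair
            else
                break;                               // i and j have crossed
        }
        std::swap( a[ i ], a[ right - 1 ] );         // restore the pivot

        quicksort( a, left, i - 1 );                 // sort the small elements
        quicksort( a, i + 1, right );                // sort the large elements
    }
    else
        insertionSort( a, left, right );             // assumed ranged insertion sort for small subarrays
}

template <typename Comparable>
void quicksort( vector<Comparable> & a )             // public driver
{
    quicksort( a, 0, static_cast<int>( a.size( ) ) - 1 );
}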
Analysis of Quicksort
• The worst-case bound for quicksort is Θ(N²).
• Best-Case Analysis gives ϴ(N log N).
• Average-case O(N log N).

• For the proofs, we will assume:


• a random pivot (no median-of-three partitioning) and
• no cutoff for small arrays
• Obviously, we will take T(0) = T(1) = 1
• Since running time of quicksort is equal to running time of the two recursive
calls plus the linear time spent in the partition (the pivot selection takes only
constant time), the basic quicksort relation
T(N) = T(i) + T(N − i − 1) + cN
where i = |S1| is the number of elements in S1.
3 cases for the Proofs of the Analyses
Worst-Case Analysis:
• The pivot is the smallest element, all the time. Then i = 0, and if we ignore
T(0) = 1 (insignificant), the recurrence is T(N) = T(N − 1) + c N, N > 1
Using the basic quicksort relation
T(N − 1) = T(N − 2) + c (N − 1)
T(N − 2) = T(N − 3) + c(N − 2)
...
T(2) = T(1) + c (2)
Adding up all these equations (they telescope) yields
T(N) = T(1) + c(2 + 3 + · · · + N) = Θ(N²)
Best-Case Analysis:
• Best case: the pivot is in the middle.
• We assume that the two subarrays are each exactly half the size of the
original (a slight overestimate; but ok for Big-Oh complexity).
T(N) = 2T(N/2) + cN  ⇒  T(N)/N = T(N/2)/(N/2) + c
Likewise, T(N/2) / (N/2) = T(N/4) / (N/4) + c
T(N/4) / (N/4) = T(N/8) / (N/8) + c
Etc. until T(2) / 2 = T(1) / 1 + c
Adding up all these equations (there are log N of them), we get:
T(N) / (N) = T(1) / 1 + c log N
which yields (same results as mergeSort, Section 7.8):
T(N) = c N log N + N = Θ (N log N)
Average-Case Analysis:
• The analysis starts from the equation T(N) = T(i) + T(N − i − 1) + cN
seen earlier
• It is a bit lengthier but with simple algebra steps
• It eventually leads to T(N) = O(N logN)

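A brief outline of those algebra steps (a sketch of the standard argument, not the full
textbook proof): with a random pivot, each subarray size i = 0, 1, ..., N − 1 is equally
likely, so averaging the relation gives
T(N) = (2/N) [ T(0) + T(1) + · · · + T(N − 1) ] + cN
Multiplying by N, writing the same equation for N − 1, and subtracting yields
N T(N) − (N − 1) T(N − 1) = 2 T(N − 1) + 2cN − c
Dropping the insignificant −c and dividing by N(N + 1):
T(N)/(N + 1) = T(N − 1)/N + 2c/(N + 1)
Telescoping this down to T(1) leaves a harmonic sum, which is O(log N), so
T(N)/(N + 1) = O(log N), i.e. T(N) = O(N log N).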
Basic Tree Properties and Lower Bounds
• Lemma 7.1: Let T be a binary tree of depth d. Then T has at
most 2^d leaves.

• Lemma 7.2: A binary tree with L leaves must have depth at least
ceiling(log L).

• Theorem 7.6: Any sorting algorithm that uses only comparisons
between elements requires at least ceiling(log(N!)) comparisons.
• Theorem 7.7: Any sorting algorithm that uses only comparisons
between elements requires on average Ω(N log N) comparisons.
A General Lower Bound for Sorting
• We have algorithms that sort in O(N log N) time; but is that as good
as we can do?
• As just stated, any algorithm for sorting that uses only
comparisons requires Ω (N log N) comparisons (and hence time)
in the worst case.
==> So mergesort and heapsort are optimal to within a constant
factor.
• The proof can be extended to show that Ω (N log N) comparisons
are required, even on average, for any sorting algorithm that uses
only comparisons.
==> So quicksort is optimal on average to within a constant factor.
Linear-Time Sorts: Bucket Sort
(https://www.programiz.com/dsa/bucket-sort)

• Bucket Sort divides the unsorted array elements into several groups
called buckets.
• Each bucket is then sorted by using any of the suitable sorting
algorithms, or recursively applying the same bucket algorithm.
• Finally, the sorted buckets are combined to form a final sorted array.
• The process of bucket sort can be understood as a scatter-gather
approach:
• elements are first scattered into buckets;
• then the elements in each bucket are sorted;
• finally, the elements are gathered in order.
Working of Bucket Sort
Suppose the input array, having values between 0 and 1, is:
0.42 0.32 0.23 0.52 0.25 0.47 0.51

Create an array of size 10. Each slot of this array is used as a bucket for storing elements.
0 0 0 0 0 0 0 0 0 0
• Insert elements into the buckets from the array. The elements are inserted according to
the range of the bucket.
• Suppose we have buckets each covering the ranges 0 to 1, 1 to 2, 2 to 3, ..., (n−1) to n.
• Suppose, input element is 0.23. It is multiplied by size = 10 (i.e. 0.23*10 = 2.3).
• Then, it is converted into an integer (i.e. floor(2.3) = 2).
• Finally, 0.23 is inserted into bucket-2.
• If an integer is taken as input, divide it by the interval (10 here) and take the floor
value to get the bucket index.
Insert all the elements into the buckets from the array

Bucket 2: 0.23, 0.25   Bucket 3: 0.32   Bucket 4: 0.42, 0.47   Bucket 5: 0.52, 0.51
(all other buckets are empty)

Sort the elements in each bucket separately using some sorting algorithm
(insertionsort, quicksort, etc.)

Bucket 2: 0.23, 0.25   Bucket 3: 0.32   Bucket 4: 0.42, 0.47   Bucket 5: 0.51, 0.52

The elements from each bucket are gathered by iterating through the buckets and copying
the bucket elements into the original array. (Each element from the bucket is erased once it
is copied into the original array.)
0.23 0.25 0.32 0.42 0.47 0.51 0.52

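A minimal sketch of this scatter-sort-gather procedure for values in [0, 1). As an
illustrative simplification it uses one bucket per element (rather than the fixed ten
buckets of the example), and std::sort stands in for "any suitable sorting algorithm":

// Bucket sort sketch for doubles uniformly distributed in [0, 1).
#include <algorithm>
#include <vector>
using std::vector;

void bucketSort( vector<double> & a )
{
    int n = a.size( );
    if( n == 0 )
        return;

    vector<vector<double>> buckets( n );              // one bucket per element

    for( double x : a )                               // scatter: bucket index = floor(x * n)
        buckets[ static_cast<int>( x * n ) ].push_back( x );

    for( auto & b : buckets )                         // sort each bucket separately
        std::sort( b.begin( ), b.end( ) );

    int idx = 0;                                      // gather the buckets back, in order
    for( auto & b : buckets )
        for( double x : b )
            a[ idx++ ] = x;
}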
Analysis of Bucket Sort
• The input A1, A2, . . . , AN must be only positive integers smaller than M.
• The array of buckets, of size M, is initialized to all 0s. So it has M buckets,
initially empty.
Worst Case Complexity: O(N²)
• When there are elements of close range in the array, they are likely to be
placed in the same bucket.
 Some buckets may have more elements than others.
 The complexity will depend on the sorting algorithm used to sort the
elements of the bucket.
The complexity becomes even worse when the elements are in reverse order.
• If insertion sort is used to sort the elements of a bucket, then the worst-case time
complexity becomes O(N²).
Best Case Complexity: O(N+M)
• Best case occurs when the elements are uniformly distributed in the
buckets with a nearly equal number of elements in each bucket.
• The complexity becomes even better if the elements inside the buckets
are already sorted.
• If insertion sort is used to sort elements of a bucket then the overall
complexity in the best case will be linear i.e. O(N+M).
where O(N) is the complexity for making the buckets and
O(M) is the complexity for sorting the elements of the bucket
using algorithms having linear time complexity in the best
case.

Average Case Complexity: O(N)
• It occurs when the elements are distributed randomly in the
array.
• Even if the elements are not distributed uniformly, bucket
sort runs in linear time.
• This remains true until the sum of the squares of the bucket
sizes is linear in the total number of elements.

Application: Bucket sort is mainly useful when the input is uniformly distributed
over a range of values.
Counting Sort (https://www.programiz.com/dsa/counting-sort)
• Counting Sort is a non-comparison-based sorting algorithm that works well
when there is a limited range of input values.
• It is particularly efficient when the range of input values is small compared to the
number of elements to be sorted.
• Sorts the elements of an array by counting the number of occurrences of each
unique element in the array.
• The count is stored in an auxiliary array and the sorting is done by mapping the
count as an index of the auxiliary array.

Working of Counting Sort:


Find out the maximum element, max, from the given array.

Initialize an array of length max+1 with all elements set to 0.
This array is used for storing the count of the elements in the array.

Store the count of each element at their respective index in count array

Store the cumulative sum of the elements of the count array.
It helps in placing the elements into the correct index of the sorted array.

Find the index of each element of the original array in the count array; this gives its
cumulative count. Place the element at the position indicated by that count.

After placing each element at its correct position, decrease its count by one.
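A minimal sketch of these steps for non-negative integers; scanning the input backwards
when placing elements is what keeps the sort stable:

// Counting sort sketch: count, take cumulative sums, then place elements stably.
#include <algorithm>
#include <vector>
using std::vector;

void countingSort( vector<int> & a )
{
    if( a.empty( ) )
        return;

    int n = a.size( );
    int maxVal = *std::max_element( a.begin( ), a.end( ) );
    vector<int> count( maxVal + 1, 0 );               // count array of length max + 1, all zeros

    for( int x : a )                                  // store the count of each element
        ++count[ x ];
    for( int i = 1; i <= maxVal; ++i )                // cumulative sums
        count[ i ] += count[ i - 1 ];

    vector<int> output( n );
    for( int i = n - 1; i >= 0; --i )                 // place elements; backward scan keeps stability
    {
        output[ count[ a[ i ] ] - 1 ] = a[ i ];       // the cumulative count gives the position
        --count[ a[ i ] ];                            // decrease its count by one
    }
    a = output;
}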
Counting Sort
Worst Case Complexity: O(N+M)
Best Case Complexity: O(N+M)
Average Case Complexity: O(N+M)
where N is the size of the input array and M is the size of the
count array.

In all the above cases, the complexity is the same because, no
matter how the elements are placed in the array, the algorithm goes
through them N+M times.

Advantages & Disadvantages of Counting Sort
Advantage of Counting Sort:
• Counting sort generally performs faster than comparison-based sorting
algorithms, such as merge sort and quicksort, when the range of keys is not
much larger than the number of elements.
• Counting sort is easy to code
• Counting sort is a stable algorithm.

Disadvantage of Counting Sort:


• Counting sort does not work on decimal values.
• Counting sort is inefficient if the range of values to be sorted is very large.
• Counting sort is not an in-place sorting algorithm; it uses extra space for
sorting the array elements.
Linear-Time Sorts: Radix Sort (programiz.com)
Radix sort sorts the elements by
• first grouping the individual digits of the same place value,
• then, sorting the elements according to their increasing/decreasing
order.

Working of Radix Sort
1. Find the largest element in the array, i.e. max. Let X be the number of
digits in max.
X is calculated because we have to go through all the significant
places of all elements.
In the example [121, 432, 564, 23, 1, 45, 788], 788 is the largest. It
has 3 digits. Therefore, the loop should go up to hundreds place (3
times).

2. Now, go through each significant place one by one.


Use any stable sorting technique to sort the digits at each significant
place. E.g. counting sort.
Sort the elements based on the unit place digits (X = 0).
Now, sort the elements based on digits at tens place

Finally, sort the elements based on the digits at hundreds place.

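A minimal LSD radix sort sketch, using a stable counting sort on one decimal digit per
pass, as described above (for the example array it makes three passes: units, tens,
hundreds):

// Radix sort sketch for non-negative integers: one stable counting-sort pass per digit.
#include <algorithm>
#include <vector>
using std::vector;

static void countingSortByDigit( vector<int> & a, int exp )   // exp = 1, 10, 100, ...
{
    int n = a.size( );
    vector<int> output( n );
    int count[ 10 ] = { 0 };

    for( int x : a )                                  // count occurrences of this digit
        ++count[ ( x / exp ) % 10 ];
    for( int d = 1; d < 10; ++d )                     // cumulative counts
        count[ d ] += count[ d - 1 ];
    for( int i = n - 1; i >= 0; --i )                 // stable placement (scan backwards)
    {
        int d = ( a[ i ] / exp ) % 10;
        output[ --count[ d ] ] = a[ i ];
    }
    a = output;
}

void radixSort( vector<int> & a )
{
    if( a.empty( ) )
        return;
    int maxVal = *std::max_element( a.begin( ), a.end( ) );
    for( int exp = 1; maxVal / exp > 0; exp *= 10 )   // one pass per significant place
        countingSortByDigit( a, exp );
}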
Analysis of Radix Sort
 Since Radix Sort is a non-comparison-based algorithm, it has advantages over
comparison-based sorting algorithms.
 For the Radix Sort that uses counting sort as an intermediate stable
sort, the time complexity is O(d · (N+M)), where d is the number of
passes (digits) and O(N+M) is the time complexity of Counting Sort.
 Radix Sort has linear time complexity, which is better than the O(N log N) of
comparison-based sorting algorithms.
 Even for numbers with many digits, or numbers in other bases such as
32-bit and 64-bit numbers, it can still perform in linear time.
 However, the intermediate sort takes large extra space, so Radix Sort is
space-inefficient.
 This is why this algorithm is not used in software libraries.
Slides based on the textbook: Mark Allen Weiss (2014), Data Structures and
Algorithm Analysis in C++, 4th edition, Pearson.

Acknowledgement: These course PowerPoints make substantial (non-exclusive) use of
the PPT chapters prepared by Prof. Saswati Sarkar from the University of Pennsylvania,
USA, themselves developed on the basis of the course textbook. Other references, if any,
will be mentioned wherever applicable.
