Sorting Algorithms Analysis

Abstract—This report presents a comparative analysis of the performance of five prominent sorting algorithms: Insertion Sort, Merge Sort, Quick Sort, Bucket Sort, and Radix Sort. The evaluation is conducted by measuring the execution times of these algorithms across datasets of varying sizes, under both sorted and unsorted conditions. Through this analysis, we aim to identify the most efficient sorting algorithms based on the characteristics of the datasets, providing insights into their suitability for different data scenarios.

978-1-6654-5992-1/22/$31.00 ©2022 IEEE

I. INTRODUCTION

A. Background on sorting, usage and applications

Sorting is a fundamental operation in computer science and plays a crucial role in organizing data efficiently. It involves arranging elements in a specific order, typically in ascending or descending sequence. Efficient sorting improves the performance of other algorithms that require sorted data as input, such as search algorithms, graph traversal techniques, and database queries. Sorting also has numerous applications in diverse domains, including numerical computations, data analysis, and real-time systems.

Sorting algorithms are vital in various real-world scenarios, from processing large datasets in databases and search engines to optimizing routing algorithms in networks. The choice of sorting algorithm can significantly affect overall system performance, making it essential to understand their behavior in different contexts.

B. Details on time complexity of the algorithms involved (best/worst/average)

TABLE I
TIME COMPLEXITIES OF SORTING ALGORITHMS

Algorithm      | Best Case  | Average Case | Worst Case
Insertion Sort | O(n)       | O(n^2)       | O(n^2)
Merge Sort     | O(n log n) | O(n log n)   | O(n log n)
Quick Sort     | O(n log n) | O(n log n)   | O(n^2)
Bucket Sort    | O(n + k)   | O(n + k)     | O(n^2)
Radix Sort     | O(nk)      | O(nk)        | O(nk)

1) Insertion Sort:

a) Best case: Occurs when the input is already sorted. The algorithm traverses the list once.

b) Worst case: Happens when the input is sorted in reverse order, requiring the maximum number of comparisons and shifts.

c) Average case: On average, each element must be compared to half of the already sorted list.

2) Merge Sort:

a) Best/Worst/Average case: Merge Sort consistently divides the dataset into smaller subarrays and merges them in a sorted manner, resulting in predictable time complexity across all cases. Its divide-and-conquer approach makes it highly efficient for large datasets, although it requires additional memory for the merging process.

3) Quick Sort:

a) Best case: Achieved when the pivot divides the dataset evenly in every recursive call.

b) Worst case: Occurs when the pivot is consistently the smallest or largest element, resulting in unbalanced partitions.

c) Average case: In most cases, Quick Sort performs efficiently with balanced partitions, making it faster than other algorithms in practice.

4) Bucket Sort:

a) Best case: When data is uniformly distributed across the buckets, the algorithm sorts the elements in linear time.

b) Worst case: If all elements end up in a single bucket, the algorithm degrades to quadratic complexity, since sorting within that bucket dominates.

c) Average case: Generally performs well when data is uniformly distributed and the number of buckets (k) is optimized.

5) Radix Sort:

a) Best/Worst/Average case: Radix Sort processes each digit of the numbers individually, with time complexity dependent on the number of digits (k) and the number of elements (n). It performs efficiently on data with limited digit ranges.

II. METHODOLOGY

A. Data Generation

In our methodology, data was generated in two distinct orders (sorted and unsorted) for varying dataset sizes. Python was employed to create datasets ranging from 1 to 10,000,000 elements. The unsorted data was generated by populating the datasets with random numbers within the range of 0 to 1000. Conversely, the sorted datasets were created by sequentially incrementing numbers within the same range. This approach allowed for a thorough performance evaluation of the sorting algorithms under both ordered and unordered conditions.
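The generation and timing steps described above can be sketched in Python. The function names, the `MAX_RANGE` constant, and the capping strategy for sorted data are illustrative assumptions, not taken from the original code (the report does not specify how 10,000,000 "sequentially incrementing" values were kept inside the 0 to 1000 range).

```python
import random
import time

MAX_RANGE = 1000  # value range stated in the text

def generate_unsorted(n):
    """Unsorted dataset: n random integers in [0, MAX_RANGE]."""
    return [random.randint(0, MAX_RANGE) for _ in range(n)]

def generate_sorted(n):
    """Sorted dataset: incrementing values capped at MAX_RANGE.
    The cap is an assumption; it keeps values in range while
    preserving non-decreasing order."""
    return [min(i, MAX_RANGE) for i in range(n)]

def time_sort(sort_fn, data):
    """Wall-clock time of one sorting call on a private copy of data."""
    working = list(data)
    start = time.perf_counter()
    sort_fn(working)
    return time.perf_counter() - start
```

Timing a fresh copy per run, as `time_sort` does, prevents an earlier run from handing later runs already-sorted input and skewing the comparison.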
B. Implementation of Algorithms

1) Insertion Sort: Implementation

InsertionSort(array)
    for step = 1 to length(array) - 1 do
        key ← array[step]
        j ← step - 1
        while j ≥ 0 and key < array[j] do
            array[j + 1] ← array[j]
            j ← j - 1
        end while
        array[j + 1] ← key
    end for

2) Merge Sort: Implementation

MergeSort(array)
    if length(array) > 1 then
        r ← length(array) // 2
        L ← array[0:r]
        M ← array[r:length(array)]
        MergeSort(L)
        MergeSort(M)
        i ← j ← k ← 0
        while i < length(L) and j < length(M) do
            if L[i] < M[j] then
                array[k] ← L[i]
                i ← i + 1
            else
                array[k] ← M[j]
                j ← j + 1
            end if
            k ← k + 1
        end while
        while i < length(L) do
            array[k] ← L[i]
            i ← i + 1
            k ← k + 1
        end while
        while j < length(M) do
            array[k] ← M[j]
            j ← j + 1
            k ← k + 1
        end while
    end if

3) Quick Sort: Implementation

QuickSort(array, low, high)
    if low < high then
        pi ← Partition(array, low, high)
        QuickSort(array, low, pi - 1)
        QuickSort(array, pi + 1, high)
    end if

Partition(array, low, high)
    pivot ← array[high]
    i ← low - 1
    for j = low to high - 1 do
        if array[j] ≤ pivot then
            i ← i + 1
            Swap(array[i], array[j])
        end if
    end for
    Swap(array[i + 1], array[high])
    return i + 1

4) Bucket Sort: Implementation

BucketSort(array)
    if length(array) > 1 then
        Create empty buckets
        for each element in array do
            index ← 10 * element / MAX_RANGE
            Insert element into bucket[index]
        end for
        for each bucket[i] do
            Sort(bucket[i])
        end for
        Combine all buckets into array
    end if

5) Radix Sort: Implementation

RadixSort(array)
    maxElement ← max(array)
    place ← 1
    while maxElement // place > 0 do
        CountingSort(array, place)
        place ← place * 10
    end while

CountingSort(array, place)
    size ← length(array)
    output ← array of length size, filled with 0
    count ← array of length 10, filled with 0
    for i = 0 to size - 1 do
        index ← array[i] // place
        count[index % 10] ← count[index % 10] + 1
    end for
    for i = 1 to 9 do
        count[i] ← count[i] + count[i - 1]
    end for
    for i = size - 1 downto 0 do
        index ← array[i] // place
        output[count[index % 10] - 1] ← array[i]
        count[index % 10] ← count[index % 10] - 1
    end for
    for i = 0 to size - 1 do
        array[i] ← output[i]
    end for
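As a check on the pseudocode above, the following is a direct Python transcription of InsertionSort and of RadixSort with its CountingSort helper. Function names follow the pseudocode; the radix routine assumes non-negative integers, which matches the 0 to 1000 datasets used here.

```python
def insertion_sort(array):
    """In-place insertion sort, mirroring the InsertionSort pseudocode."""
    for step in range(1, len(array)):
        key = array[step]
        j = step - 1
        # Shift larger elements one slot right until key's position is found.
        while j >= 0 and key < array[j]:
            array[j + 1] = array[j]
            j -= 1
        array[j + 1] = key

def counting_sort(array, place):
    """Stable counting sort on the digit selected by place (1, 10, 100, ...)."""
    size = len(array)
    output = [0] * size
    count = [0] * 10
    for value in array:
        count[(value // place) % 10] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]      # prefix sums give final digit positions
    for value in reversed(array):     # reverse pass keeps the sort stable
        digit = (value // place) % 10
        output[count[digit] - 1] = value
        count[digit] -= 1
    array[:] = output

def radix_sort(array):
    """LSD radix sort for non-negative integers, one digit per pass."""
    if not array:
        return
    place = 1
    while max(array) // place > 0:
        counting_sort(array, place)
        place *= 10
```

The reverse iteration in `counting_sort` is what makes each digit pass stable, and stability is what lets later (higher-digit) passes preserve the ordering established by earlier ones.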
III. RESULTS

The performance of the sorting algorithms was evaluated on two types of datasets: unsorted and sorted. The following analysis highlights their execution times across various array sizes:

A. Insertion Sort

1) Unsorted Data: Insertion Sort performs efficiently on smaller datasets (up to 10 elements) with execution times below 0.00005 seconds. However, its performance degrades significantly as the array size increases, taking over 351 seconds for 100,000 elements and exceeding 10 minutes for arrays of 1,000,000 and 10,000,000 elements.

2) Sorted Data: The execution time is much improved for sorted data, with the fastest execution at 0.000018 seconds for a single element and only 0.140963 seconds for 1,000,000 elements.

B. Merge Sort

1) Unsorted Data: This algorithm consistently shows efficient performance across all array sizes, with execution times ranging from 0.000029 seconds for 1 element to about 0.305699 seconds for 100,000 elements.

2) Sorted Data: Merge Sort's execution time increases to 33.211021 seconds for 10,000,000 elements, but it remains efficient compared to the others, making it a solid choice for large datasets.

C. Quick Sort

1) Unsorted Data: Quick Sort performs well on unsorted data, taking only 55.208344 seconds for 1,000,000 elements, but it also has cases where performance degrades significantly, especially on sorted data.

2) Sorted Data: The execution time rises dramatically to more than 10 minutes for both 1,000,000 and 10,000,000 elements, indicating that Quick Sort may not be the best option for already sorted datasets.

D. Bucket Sort

1) Unsorted Data: This algorithm demonstrates superior performance, with the best execution times of 0.000716 seconds for 1,000 elements and 1.456125 seconds for 1,000,000 elements. However, performance suffers on larger datasets.

2) Sorted Data: Bucket Sort performs consistently better with sorted data, taking 9.858302 seconds for 10,000,000 elements, showing its effectiveness in handling larger datasets.

E. Radix Sort

1) Unsorted Data: Radix Sort shows competitive performance, with execution times around 1.720363 seconds for 1,000,000 elements, indicating efficiency for larger datasets.

2) Sorted Data: The execution times for sorted data reveal that Radix Sort remains efficient, taking about 13.558841 seconds for 10,000,000 elements.

F. Comparative Analysis

1) For small datasets (1 to 100 elements), Insertion Sort excels, while Merge Sort and Quick Sort provide reliable performance.

2) As the dataset size increases, Merge Sort and Quick Sort emerge as the better options for unsorted data, with Bucket Sort also showing strong results.

3) For sorted data, Insertion Sort shows significant improvements, while Quick Sort's performance deteriorates considerably, making it less ideal for already sorted datasets.

4) In general, Merge Sort remains consistent across varying array sizes, making it suitable for both sorted and unsorted data, while Bucket Sort and Radix Sort demonstrate superior efficiency under specific conditions.
TABLE II
EXECUTION TIMES FOR UNSORTED DATA

Array Size | Insertion Sort | Merge Sort | Quick Sort | Bucket Sort | Radix Sort
1          | 0.000029s      | 0.000029s  | 0.000018s  | 0.000018s   | 0.000030s
10         | 0.000034s      | 0.000044s  | 0.000033s  | 0.000039s   | 0.000046s
100        | 0.000465s      | 0.000329s  | 0.000185s  | 0.000164s   | 0.000198s
1000       | 0.035414s      | 0.001979s  | 0.001148s  | 0.000716s   | 0.000860s
10000      | 2.781630s      | 0.020365s  | 0.016549s  | 0.007563s   | 0.010893s
100000     | 351.657470s    | 0.305699s  | 0.795360s  | 0.102353s   | 0.141462s
1000000    | ≥ 10m          | 3.643683s  | 55.208344s | 1.456125s   | 1.720363s
10000000   | ≥ 10m          | 46.158042s | ≥ 10m      | 16.251020s  | 18.171996s

TABLE III
EXECUTION TIMES FOR SORTED DATA

Array Size | Insertion Sort | Merge Sort | Quick Sort | Bucket Sort | Radix Sort
1          | 0.000018s      | 0.000014s  | 0.000012s  | 0.000014s   | 0.000024s
10         | 0.000021s      | 0.000034s  | 0.000040s  | 0.000038s   | 0.000042s
100        | 0.000056s      | 0.000253s  | 0.000596s  | 0.000092s   | 0.000110s
1000       | 0.000142s      | 0.001368s  | 0.043505s  | 0.000617s   | 0.000862s
10000      | 0.001310s      | 0.017597s  | 4.382913s  | 0.005653s   | 0.011241s
100000     | 0.018397s      | 0.269224s  | ≥ 10m      | 0.089493s   | 0.136276s
1000000    | 0.140963s      | 2.825920s  | ≥ 10m      | 0.903497s   | 1.227430s
10000000   | 1.480360s      | 33.211021s | ≥ 10m      | 9.858302s   | 13.558841s
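Quick Sort's collapse on sorted input follows directly from the last-element pivot in the Partition pseudocode: on ascending data, every partition is maximally unbalanced. The following small counting experiment makes this concrete. It is written iteratively to avoid Python's recursion limit in the worst case, and the comparison counter is added here for illustration; it is not part of the measured implementations.

```python
import random

def quicksort_comparisons(array):
    """Quick sort with a last-element pivot (Lomuto partition),
    returning the number of element comparisons performed.
    Iterative so that the worst case does not overflow the
    recursion limit."""
    comparisons = 0
    stack = [(0, len(array) - 1)]
    while stack:
        low, high = stack.pop()
        if low >= high:
            continue
        pivot = array[high]
        i = low - 1
        for j in range(low, high):
            comparisons += 1
            if array[j] <= pivot:
                i += 1
                array[i], array[j] = array[j], array[i]
        array[i + 1], array[high] = array[high], array[i + 1]
        pi = i + 1
        stack.append((low, pi - 1))
        stack.append((pi + 1, high))
    return comparisons

n = 2000
asc = list(range(n))                       # already sorted: worst case
rnd = list(range(n))
random.shuffle(rnd)                        # shuffled: typical case

worst = quicksort_comparisons(asc)         # n(n-1)/2 comparisons
typical = quicksort_comparisons(rnd)       # roughly 2 n ln n comparisons
```

On sorted input every element is compared against the (maximal) pivot, so the count is exactly n(n-1)/2, while the shuffled run stays near 2 n ln n, mirroring the gap between Tables II and III.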
IV. CONCLUSIONS

The analysis of various sorting algorithms on both unsorted and sorted datasets reveals distinct performance characteristics that can guide algorithm selection for specific use cases. Insertion Sort, while fast for small arrays, shows significant performance degradation with larger datasets. Its advantage lies primarily in its efficiency on nearly sorted or small datasets, where it can outperform more complex algorithms. This makes it suitable for scenarios where datasets are small or have limited disorder.

Merge Sort consistently demonstrates reliable performance across all array sizes and types of data. It handles larger datasets efficiently, making it a robust choice for applications requiring stable and predictable performance. Its time complexity of O(n log n) allows it to manage substantial data volumes effectively, making it ideal for scenarios in which data integrity and stability are crucial, such as database sorting or large-scale data processing tasks.

Quick Sort offers competitive performance on unsorted datasets, particularly when space efficiency is a concern. However, its vulnerability to poor performance on sorted data underscores the importance of analyzing data characteristics before implementation.

Bucket Sort and Radix Sort stand out for their ability to manage larger datasets efficiently under specific conditions. Their performance illustrates the potential benefits of choosing algorithms based on data distribution and structure.

Ultimately, the choice of sorting algorithm should align with the specific requirements of the task at hand, considering factors such as data size, structure, and performance needs. By understanding the strengths and limitations of each algorithm, developers can make informed decisions that optimize sorting operations and improve overall system performance.

The project can be viewed at the following link: https://www.overleaf.com/read/grncswyjcfvzca6360