Sorting Assignment
Assignment
Objective
Students will implement IntroSort and TimSort in C++, compare their performance with STL
sort, QuickSort, and MergeSort, and prepare a report based on their observations.
Task Details
Report Format
Title Page
1. Introduction
2. Implementation Details
Explain the logic behind each sorting algorithm implemented.
Provide C++ code snippets (or references to full code in the appendix).
Mention optimizations used, if any.
3. Experimental Setup
5. Discussion
6. Conclusion
7. References
1. Introduction
IntroSort: A hybrid sorting algorithm that begins with Quicksort and switches to
Heapsort when the recursion depth exceeds a set limit, which avoids Quicksort's
O(n^2) worst case. It also uses Insertion Sort for small partitions.
TimSort: A stable hybrid sorting algorithm derived from Merge Sort and Insertion
Sort. It is designed to perform well on real-world data by taking advantage of
portions of the input that are already sorted.
2. Implementation Details
STL std::sort: A highly optimized hybrid sort that combines Quicksort, Heapsort,
and Insertion Sort.
IntroSort: Starts with Quicksort, switches to Heapsort at a certain recursion depth,
and uses Insertion Sort for small partitions.
Merge Sort: A stable, divide-and-conquer algorithm that recursively splits the array
and merges sorted halves.
Quick Sort: A divide-and-conquer algorithm that partitions the array around a
pivot element.
Tim Sort: A hybrid sorting algorithm that sorts short runs with Insertion Sort and
then merges the runs as in Merge Sort, taking advantage of data that is already sorted.
1. STL std::sort
#include<bits/stdc++.h>
using namespace std;
int main() {
    srand(time(0));
    vector<int> rand_vec;
    // Fill the vector with 100000 pseudo-random values in [1000, 9999].
    for (int i = 0; i < 100000; ++i) {
        int rand_num = rand() % 9000 + 1000;
        rand_vec.push_back(rand_num);
    }
    sort(rand_vec.begin(), rand_vec.end());
    //for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<" ";
    cout<<endl;
}
//time - 59ms
How It Sorts: The std::sort function in the STL (Standard Template Library) is typically
implemented as Introsort (for example, in GCC's libstdc++), a hybrid algorithm that starts
with Quicksort. When the recursion depth exceeds a threshold proportional to the logarithm
of the number of elements, it switches to Heapsort to avoid Quicksort's O(n^2) worst case.
For small partitions, it employs Insertion Sort, which is efficient for small datasets.
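The listings in this report quote times such as 59 ms but do not show how they were measured. The following is a minimal sketch of one way to time a sort with std::chrono. The helper names timeSortMs and stlSort and the choice of steady_clock are assumptions made for illustration, not the method behind the reported figures.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Hypothetical helper: runs `sorter` on a private copy of `data` and returns
// the elapsed wall-clock time in milliseconds. Sorting a copy leaves the input
// unchanged, so every algorithm can be timed on identical data.
template <typename Sorter>
double timeSortMs(const std::vector<int>& data, Sorter sorter,
                  std::vector<int>* sorted_out = nullptr) {
    std::vector<int> copy = data;
    auto start = std::chrono::steady_clock::now();
    sorter(copy);
    auto stop = std::chrono::steady_clock::now();
    if (sorted_out != nullptr) *sorted_out = copy;  // optional: expose the result
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Example sorter that wraps std::sort.
inline void stlSort(std::vector<int>& v) { std::sort(v.begin(), v.end()); }
```

A call such as timeSortMs(rand_vec, stlSort) would return the time for std::sort alone, excluding data generation and output.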
2. IntroSort:
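#include<bits/stdc++.h>
using namespace std;
template <typename rand_acc_it>
void insertionSort(rand_acc_it begin, rand_acc_it end) {
    for (rand_acc_it i = begin; i != end; ++i) {
        for (rand_acc_it j = i; j != begin && *j < *(j - 1); --j) {
            std::iter_swap(j, j - 1);
        }
    }
}
// Sketch of the missing helpers (assumed; not the author's original code).
// They follow the "How It Sorts" steps listed after this program.
template <typename rand_acc_it>
void heapSort(rand_acc_it begin, rand_acc_it end) {
    std::make_heap(begin, end);
    std::sort_heap(begin, end);
}
template <typename rand_acc_it>
void introSortImpl(rand_acc_it begin, rand_acc_it end, int depth_limit) {
    if (end - begin <= 16) {
        insertionSort(begin, end);
        return;
    }
    if (depth_limit == 0) {
        heapSort(begin, end);
        return;
    }
    auto pivot = *(begin + (end - begin) / 2);
    // Three-way split: [< pivot][== pivot][> pivot].
    rand_acc_it lt = std::partition(begin, end, [&](const auto& x) { return x < pivot; });
    rand_acc_it gt = std::partition(lt, end, [&](const auto& x) { return !(pivot < x); });
    introSortImpl(begin, lt, depth_limit - 1);
    introSortImpl(gt, end, depth_limit - 1);
}
template <typename rand_acc_it>
void introSort(rand_acc_it begin, rand_acc_it end) {
    if (end - begin < 2) return;
    introSortImpl(begin, end, 2 * static_cast<int>(std::log2(end - begin)));
}
int main() {
    srand(time(0));
    vector<int> rand_vec;
    for (int i = 0; i < 100000; ++i) {
        int rand_num = rand() % 9000 + 1000;
        rand_vec.push_back(rand_num);
    }
    introSort(rand_vec.begin(), rand_vec.end());
    //for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<" ";
    cout<<endl;
}
//time - 78ms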
How It Sorts:
1. Quicksort: It starts with Quicksort, selecting a pivot element and partitioning the
array into elements less than and greater than the pivot.
2. Switch to Heapsort: If the recursion depth exceeds a predefined limit (twice the
logarithm of the number of elements), it switches to Heapsort to ensure O(n log n)
performance.
3. Insertion Sort for Small Arrays: For small partitions (typically 16 elements or
fewer), it uses Insertion Sort due to its efficiency in such scenarios.
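The two thresholds named in the steps above can be written down as small helpers. This is a sketch under stated assumptions: the logarithm is taken as base 2 and rounded down, and the function names are hypothetical.

```cpp
#include <cstddef>

// floor(log2(n)) for n >= 1, computed with integer shifts.
inline int floorLog2(std::size_t n) {
    int log = 0;
    while (n > 1) {
        n >>= 1;
        ++log;
    }
    return log;
}

// Hypothetical helper: recursion budget of 2 * floor(log2(n)) levels.
// Once the budget reaches zero, the sort would hand over to Heapsort.
inline int introsortDepthLimit(std::size_t n) {
    return n == 0 ? 0 : 2 * floorLog2(n);
}

// Partitions of 16 elements or fewer (the cutoff quoted in the text)
// would be finished with Insertion Sort.
inline bool useInsertionSort(std::size_t partition_size) {
    return partition_size <= 16;
}
```

For the 100,000-element vectors used in this report, floor(log2(100000)) is 16, so the budget under these assumptions is 32 levels.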
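3. Merge Sort:
#include<bits/stdc++.h>
using namespace std;
// Sketch of the missing mergeSort (assumed; not the author's original code).
// merge() combines the sorted halves arr[l..m] and arr[m+1..r] via temporary arrays.
void merge(vector<int>& arr, int l, int m, int r) {
    vector<int> left(arr.begin() + l, arr.begin() + m + 1);
    vector<int> right(arr.begin() + m + 1, arr.begin() + r + 1);
    size_t i = 0, j = 0;
    int k = l;
    while (i < left.size() && j < right.size())
        arr[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];  // <= keeps it stable
    while (i < left.size()) arr[k++] = left[i++];
    while (j < right.size()) arr[k++] = right[j++];
}
void mergeSort(vector<int>& arr, int l, int r) {
    if (l >= r) return;
    int m = l + (r - l) / 2;
    mergeSort(arr, l, m);
    mergeSort(arr, m + 1, r);
    merge(arr, l, m, r);
}
int main() {
    srand(time(0));
    vector<int> rand_vec;
    for (int i = 0; i < 100000; ++i) {
        int rand_num = rand() % 9000 + 1000;
        rand_vec.push_back(rand_num);
    }
    mergeSort(rand_vec, 0, 99999);
    //for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<" ";
    cout<<endl;
}
//time - 82ms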
How It Sorts:
1. Divide: Recursively divides the array into two halves until each half contains a
single element.
2. Conquer: Merges the sorted halves back together by comparing the elements and
arranging them in order.
Memory Use: Merging uses O(n) additional memory to store the temporary arrays.
Stability: Merge Sort is stable, meaning it maintains the relative order of equal
elements.
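Stability is easiest to see when each element carries a tag alongside its key. The sketch below merges two runs of tagged records; the Record type and the mergeByKey name are hypothetical and introduced only for this illustration.

```cpp
#include <cstddef>
#include <vector>

struct Record {
    int key;   // value the sort compares
    char tag;  // marks the original order of records with equal keys
};

// Merges two runs that are already sorted by key. Taking from `left` on ties
// (the <= comparison) is what keeps equal keys in their original order.
inline std::vector<Record> mergeByKey(const std::vector<Record>& left,
                                      const std::vector<Record>& right) {
    std::vector<Record> out;
    out.reserve(left.size() + right.size());
    std::size_t i = 0, j = 0;
    while (i < left.size() && j < right.size()) {
        if (left[i].key <= right[j].key) out.push_back(left[i++]);
        else out.push_back(right[j++]);
    }
    while (i < left.size()) out.push_back(left[i++]);    // leftovers from the left run
    while (j < right.size()) out.push_back(right[j++]);  // leftovers from the right run
    return out;
}
```

Merging {1,a},{2,b} with {2,c},{3,d} yields the tags a, b, c, d: the record tagged b stays ahead of c even though both have key 2. Replacing <= with < would reverse that pair and make the merge unstable.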
4. Quick Sort:
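#include<bits/stdc++.h>
using namespace std;
// Sketch of the missing quickSort (assumed; not the author's original code).
// Lomuto partition around the last element; returns the pivot's final index.
int lomutoPartition(vector<int>& arr, int low, int high) {
    int pivot = arr[high];
    int i = low - 1;
    for (int j = low; j < high; ++j) {
        if (arr[j] < pivot) swap(arr[++i], arr[j]);
    }
    swap(arr[i + 1], arr[high]);
    return i + 1;
}
void quickSort(vector<int>& arr, int low, int high) {
    if (low >= high) return;
    int p = lomutoPartition(arr, low, high);
    quickSort(arr, low, p - 1);
    quickSort(arr, p + 1, high);
}
int main() {
    srand(time(0));
    vector<int> rand_vec;
    for (int i = 0; i < 100000; ++i) {
        int rand_num = rand() % 9000 + 1000;
        rand_vec.push_back(rand_num);
    }
    quickSort(rand_vec, 0, 99999);
    //for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<" ";
    cout<<endl;
}
//time - 63ms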
How It Sorts:
1. Partition: Selects a pivot element and partitions the array such that elements less
than the pivot are on the left and elements greater than the pivot are on the right.
2. Recursion: Recursively applies the partitioning process to the left and right
subarrays.
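The steps above leave the choice of pivot open. One common option is median-of-three selection. It is shown here only as an assumption and is not necessarily the choice made in this report's listing; the name medianOfThree is hypothetical.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: orders arr[low], arr[mid] and arr[high] among
// themselves and returns mid, which then holds the median of the three.
// Using that element as the pivot avoids the worst case on input that is
// already sorted or reverse sorted.
inline int medianOfThree(std::vector<int>& arr, int low, int high) {
    int mid = low + (high - low) / 2;
    if (arr[mid] < arr[low]) std::swap(arr[mid], arr[low]);
    if (arr[high] < arr[low]) std::swap(arr[high], arr[low]);
    if (arr[high] < arr[mid]) std::swap(arr[high], arr[mid]);
    return mid;
}
```

After the call, arr[low] <= arr[mid] <= arr[high], so the partition step can start from a pivot that is neither the smallest nor the largest of the three sampled values.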
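5. Tim Sort:
#include<bits/stdc++.h>
using namespace std;
// Sketch (assumed; not the author's original code): only the merge loop survives in
// this copy. The rest follows the steps listed below, with MIN_MERGE = 32 assumed.
const int MIN_MERGE = 32;
void insertionSort(vector<int>& arr, int left, int right) {
    for (int i = left + 1; i <= right; ++i) {
        int key = arr[i], j = i - 1;
        while (j >= left && arr[j] > key) { arr[j + 1] = arr[j]; --j; }
        arr[j + 1] = key;
    }
}
void merge(vector<int>& arr, int l, int m, int r) {
    int len1 = m - l + 1, len2 = r - m;
    vector<int> left(arr.begin() + l, arr.begin() + m + 1);
    vector<int> right(arr.begin() + m + 1, arr.begin() + r + 1);
    int i = 0, j = 0, k = l;
    while (i < len1 && j < len2) {
        if (left[i] <= right[j]) {
            arr[k] = left[i];
            i++;
        } else {
            arr[k] = right[j];
            j++;
        }
        k++;
    }
    while (i < len1) arr[k++] = left[i++];
    while (j < len2) arr[k++] = right[j++];
}
void timSort(vector<int>& arr) {
    int n = arr.size();
    for (int i = 0; i < n; i += MIN_MERGE)
        insertionSort(arr, i, min(i + MIN_MERGE - 1, n - 1));
    for (int size = MIN_MERGE; size < n; size *= 2) {
        for (int l = 0; l < n; l += 2 * size) {
            int m = l + size - 1, r = min(l + 2 * size - 1, n - 1);
            if (m < r) merge(arr, l, m, r);
        }
    }
}
int main() {
    srand(time(0));
    vector<int> rand_vec;
    for (int i = 0; i < 100000; ++i) {
        int rand_num = rand() % 9000 + 1000;
        rand_vec.push_back(rand_num);
    }
    timSort(rand_vec);
    //for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<" ";
    cout<<endl;
}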
How It Sorts:
1. Runs: Divides the array into small segments called "runs" and sorts each run using
Insertion Sort. The minimum size of each run is typically set to 32 or 64.
2. Merge: Merges the sorted runs using a technique similar to Merge Sort.
Optimizations:
Runs: Efficiently handles already sorted or nearly sorted data by exploiting natural
runs in the array.
Insertion Sort for Runs: Uses Insertion Sort for sorting individual runs due to its
efficiency on small arrays.
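Exploiting natural runs means finding stretches of the input that are already in order before doing any sorting work. The sketch below shows one way such a run could be detected; the function name countRunAndMakeAscending and the rule of reversing strictly decreasing runs are assumptions modeled on common TimSort descriptions, not code taken from this report.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the length of the natural run that starts at
// `start`. A run is either non-decreasing or strictly decreasing; a decreasing
// run is reversed in place so the caller always sees ascending data.
inline std::size_t countRunAndMakeAscending(std::vector<int>& arr, std::size_t start) {
    const std::size_t n = arr.size();
    if (start >= n) return 0;
    std::size_t end = start + 1;
    if (end == n) return 1;
    if (arr[end] < arr[start]) {
        // Strictly decreasing: requiring strictness means equal elements are
        // never reversed, which preserves stability.
        while (end < n && arr[end] < arr[end - 1]) ++end;
        std::reverse(arr.begin() + start, arr.begin() + end);
    } else {
        // Non-decreasing: extend while each element is >= its predecessor.
        while (end < n && arr[end] >= arr[end - 1]) ++end;
    }
    return end - start;
}
```

On input that is already sorted, a single call covers the whole array, which is why run detection makes the sorted and nearly sorted cases cheap.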
Purpose of Minimum Merge Size: The MIN_MERGE parameter defines the minimum
size of the runs. Using a small MIN_MERGE ensures that each run is small enough for
Insertion Sort to be efficient, but large enough to minimize the overhead of frequent
merging. This balances the trade-offs between the insertion and merge phases, optimizing
overall performance.
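Rather than using MIN_MERGE directly as the run length, widely used TimSort implementations derive a minimum run length from the input size so that the number of runs is a power of two or slightly less, which keeps the merges balanced. The sketch below follows that scheme; the value MIN_MERGE = 32 and the name minRunLength are assumptions for illustration.

```cpp
// Assumed threshold; the text mentions 32 or 64 as typical values.
constexpr int MIN_MERGE = 32;

// Illustrative helper: halves n until it drops below MIN_MERGE and adds 1 if
// any bit shifted out along the way was set. For n >= MIN_MERGE the result k
// satisfies MIN_MERGE/2 <= k <= MIN_MERGE; smaller inputs are returned as-is.
inline int minRunLength(int n) {
    int r = 0;  // becomes 1 if any 1 bit is shifted off
    while (n >= MIN_MERGE) {
        r |= (n & 1);
        n >>= 1;
    }
    return n + r;
}
```

Under these assumptions, the 100,000-element input used in this report gets a minimum run length of 25, which gives 4,000 runs, just under 2^12 = 4096.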
3. Experimental Setup
Test Environment
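Dataset
Each listing sorts 100,000 pseudo-random integers in the range 1000 to 9999, generated
with rand() seeded from the current time.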
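The execution times for each sorting algorithm, as recorded in the listings above, are as follows:
STL std::sort: 59 ms
Quick Sort: 63 ms
IntroSort: 78 ms
Merge Sort: 82 ms
Tim Sort: no time given in the listing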
5. Discussion
Efficiency Comparison
STL std::sort was the fastest in these runs (59 ms), which is consistent with its
hybrid algorithm and low-level optimizations.
Quick Sort shows good average-case performance, O(n log n), but can degrade to
O(n^2) in worst-case scenarios such as consistently unbalanced partitions.
Tim Sort is very efficient on real-world data because it handles partially
sorted data well.
Merge Sort performs well when a stable sort is required but has higher memory
overhead, O(n) auxiliary space.
IntroSort balances worst-case performance and average-case efficiency.
Tim Sort excels in scenarios with partially sorted data.
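The worst-case degradation noted for Quick Sort can be made concrete by counting comparisons. The sketch below assumes a last-element pivot, which is a choice made for this illustration and not necessarily the one used in the report's listing; the name countingQuickSort is hypothetical.

```cpp
#include <utility>
#include <vector>

// Quicksort on arr[low..high] with a last-element pivot (assumed), returning
// the number of element comparisons it performed.
inline long long countingQuickSort(std::vector<int>& arr, int low, int high) {
    if (low >= high) return 0;
    long long comparisons = 0;
    int pivot = arr[high];
    int boundary = low;  // next slot for an element smaller than the pivot
    for (int j = low; j < high; ++j) {
        ++comparisons;
        if (arr[j] < pivot) {
            std::swap(arr[boundary], arr[j]);
            ++boundary;
        }
    }
    std::swap(arr[boundary], arr[high]);  // move the pivot into its final place
    comparisons += countingQuickSort(arr, low, boundary - 1);
    comparisons += countingQuickSort(arr, boundary + 1, high);
    return comparisons;
}
```

On an already sorted input of n elements, every partition puts all remaining elements on one side, so the count is n(n-1)/2: 499,500 comparisons for n = 1000. IntroSort's depth limit exists to cut this case short by switching to Heapsort.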
6. Conclusion
Summary of Findings
7. References