0% found this document useful (0 votes)
9 views15 pages

Sorting Assignment

The assignment requires students to implement and compare the performance of various sorting algorithms, including IntroSort, TimSort, QuickSort, and MergeSort, against the STL sort function. Students will analyze execution times, memory usage, and stability across different input sizes and scenarios, culminating in a report summarizing their findings. The submission includes all C++ source codes and a PDF report, with a deadline set for February 28, 2025.

Uploaded by

naurinahaque
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views15 pages

Sorting Assignment

The assignment requires students to implement and compare the performance of various sorting algorithms, including IntroSort, TimSort, QuickSort, and MergeSort, against the STL sort function. Students will analyze execution times, memory usage, and stability across different input sizes and scenarios, culminating in a report summarizing their findings. The submission includes all C++ source codes and a PDF report, with a deadline set for February 28, 2025.

Uploaded by

naurinahaque
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 15

CSE 2105: Data Structures and Algorithms

Assignment

Assignment: Implementation and Comparison of Hybrid Sorting Algorithms

Objective

Students will implement IntroSort and TimSort in C++, compare their performance with STL
sort, QuickSort, and MergeSort, and prepare a report based on their observations.

Task Details

1. Implement the following sorting algorithms in C++:


o IntroSort
o TimSort
o QuickSort
o MergeSort
2. Use the built-in STL sort function for comparison.
3. Compare the sorting algorithms based on:
o Execution time for different input sizes (small, medium, large).
o Best, average, and worst-case scenarios.
o Memory usage (if possible).
o Stability (which sorts preserve the order of equal elements).
4. Generate random test cases and analyze results.
5. Prepare a report summarizing the findings.

Report Format

Title Page

 Assignment Title: "Implementation and Comparison of Hybrid Sorting Algorithms"


 Student Name
 Student ID
 Course Name
 Instructor’s Name
 Submission Date

1. Introduction

 Briefly explain sorting algorithms and their importance.


 Define Hybrid Sort and explain IntroSort and TimSort.
 State the objective of this experiment.

2. Implementation Details
 Explain the logic behind each sorting algorithm implemented.
 Provide C++ code snippets (or references to full code in the appendix).
 Mention optimizations used, if any.

3. Experimental Setup

 Describe the test environment (CPU, RAM, compiler, OS).


 Mention the dataset (random, sorted, reverse sorted, etc.).
 Define the input sizes used for testing.

4. Results and Analysis

 Present execution time comparisons in tables and graphs.


 Analyze the performance of each algorithm in different cases.
 Discuss memory usage and stability.

5. Discussion

 Compare the efficiency of different algorithms.


 Explain why certain algorithms perform better or worse in specific scenarios.
 Discuss practical use cases of each algorithm.

6. Conclusion

 Summarize the findings.


 Suggest the best sorting algorithm based on observations.

7. References

 Cite any references used.

The students should submit a ZIP file containing:

1. All C++ source codes for the implemented sorting algorithms.


2. A PDF report following the provided format.

Submission Deadline: 28/02/2025


1. Introduction

Brief Explanation of Sorting Algorithms and Their Importance

Sorting algorithms are fundamental operations in computer science that rearrange


elements in a list or array according to a specific order, typically numerical or
lexicographical. Efficient sorting is crucial in various applications, such as searching, data
analysis, and database management, as it directly influences the performance and
efficiency of these operations.

Definition of Hybrid Sort and Explanation of IntroSort and TimSort

Hybrid sorting algorithms combine multiple sorting techniques to optimize performance by


leveraging the strengths of each technique.

 IntroSort: A hybrid sorting algorithm that begins with Quicksort and switches to
Heapsort when the recursion depth exceeds a certain threshold to avoid Quicksort's
worst-case performance. It also uses Insertion Sort for small arrays.
 TimSort: A hybrid stable sorting algorithm derived from Merge Sort and Insertion
Sort. It's designed to perform well on many kinds of real-world data by efficiently
handling already sorted portions within the data.

Objective of the Experiment

The objective of this experiment is to compare the performance of various sorting


algorithms, including STL std::sort, IntroSort, Merge Sort, Quick Sort, and Tim Sort, by
measuring their execution times on large datasets. The goal is to understand their
efficiency, memory usage, and suitability for different scenarios.

2. Implementation Details

Logic Behind Each Sorting Algorithm Implemented

 STL std::sort: A highly optimized hybrid sort that combines Quicksort, Heapsort,
and Insertion Sort.
 IntroSort: Starts with Quicksort, switches to Heapsort at a certain recursion depth,
and uses Insertion Sort for small partitions.
 Merge Sort: A stable, divide-and-conquer algorithm that recursively splits the array
and merges sorted halves.
 Quick Sort: A divide-and-conquer algorithm that partitions the array around a
pivot element.
 Tim Sort: A hybrid sorting algorithm that merges runs of already sorted data using
Merge Sort and Insertion Sort.

C++ Code Snippets (References in Appendix)


Below are references to the full code snippets for each sorting algorithm, as provided in the
user's messages:

1. STL std::sort

#include<bits/stdc++.h>
using namespace std;

int main(){
int n;
cin>>n;
srand(time(0));
vector<int> rand_vec;
for (int i = 0; i < 100000; ++i) {
int rand_num = rand() % 9000 + 1000;
rand_vec.push_back(rand_num);
}
sort(rand_vec.begin(),rand_vec.end());
//for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<"
";
cout<<endl;
}

//time - 59ms

 How It Sorts: The std::sort function in the STL (Standard Template Library) is based
on Introsort, a hybrid algorithm that starts with Quicksort. When the recursion depth
exceeds a certain threshold (logarithmic based on the number of elements), it switches to
Heapsort to avoid the worst-case performance of Quicksort. For small partitions, it
employs Insertion Sort, which is efficient for small datasets.

 Optimizations: Extensive low-level optimizations for cache efficiency, branch prediction,


and memory management.

2. IntroSort:

#include<bits/stdc++.h>
using namespace std;
template <typename rand_acc_it>
void insertionSort(rand_acc_it begin, rand_acc_it end) {
for (rand_acc_it i = begin; i != end; ++i) {
for (rand_acc_it j = i; j != begin && *j < *(j - 1);
--j) {
std::iter_swap(j, j - 1);
}
}
}

template <typename rand_acc_it>


void heapSort(rand_acc_it begin, rand_acc_it end) {
std::make_heap(begin, end);
std::sort_heap(begin, end);
}

template <typename rand_acc_it>


void quickSort(rand_acc_it begin, rand_acc_it end, int
depth_limit) {
if (end - begin <= 16) {
insertionSort(begin, end);
return;
}

if (depth_limit == 0) {
heapSort(begin, end);
return;
}

rand_acc_it pivot = std::partition(begin, end, [mid =


*(begin + (end - begin) / 2)](const auto& elem) {
return elem < mid;
});

quickSort(begin, pivot, depth_limit - 1);


quickSort(pivot, end, depth_limit - 1);
}

template <typename rand_acc_it>


void introSort(rand_acc_it begin, rand_acc_it end) {
int depth_limit = 2 * log(end - begin);
quickSort(begin, end, depth_limit);
}

int main(){
int n;
cin>>n;
srand(time(0));
vector<int> rand_vec;
for (int i = 0; i < 100000; ++i) {
int rand_num = rand() % 9000 + 1000;
rand_vec.push_back(rand_num);
}
introSort(rand_vec.begin(),rand_vec.end());
//for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<"
";
cout<<endl;
}

//time - 78ms

 How It Sorts:

1. Quicksort: It starts with Quicksort, selecting a pivot element and partitioning the
array into elements less than and greater than the pivot.
2. Switch to Heapsort: If the recursion depth exceeds a predefined limit (twice the
logarithm of the number of elements), it switches to Heapsort to ensure O(n log n)
performance.
3. Insertion Sort for Small Arrays: For small partitions (typically 16 elements or
fewer), it uses Insertion Sort due to its efficiency in such scenarios.

 Purpose: Combines the average-case efficiency of Quicksort with the worst-case


efficiency of Heapsort, and the simplicity of Insertion Sort for small arrays.
3. Merge Sort:

#include<bits/stdc++.h>
using namespace std;

void merge(vector<int>& arr, int l, int m, int r) {


vector<int> temp(r - l + 1);
int i = l, j = m + 1, k = 0;
while (i <= m && j <= r) {
if (arr[i] <= arr[j]) temp[k++] = arr[i++];
else temp[k++] = arr[j++];
}
while (i <= m) temp[k++] = arr[i++];
while (j <= r) temp[k++] = arr[j++];
for (int p = 0; p < k; p++) arr[l + p] = temp[p];
}

void mergeSort(vector<int>& arr, int l, int r) {


if (l < r) {
int m = l + (r - l) / 2;
mergeSort(arr, l, m);
mergeSort(arr, m + 1, r);
merge(arr, l, m, r);
}
}

int main(){
int n;
cin>>n;
srand(time(0));
vector<int> rand_vec;
for (int i = 0; i < 100000; ++i) {
int rand_num = rand() % 9000 + 1000;
rand_vec.push_back(rand_num);
}
mergeSort(rand_vec,0,99999);
//for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<"
";
cout<<endl;
}

//time - 82ms

 How It Sorts:

1. Divide: Recursively divides the array into two halves until each half contains a
single element.
2. Conquer: Merges the sorted halves back together by comparing the elements and
arranging them in order.

 Optimizations: Uses additional memory to store the temporary arrays during merging.

 Stability: Merge Sort is stable, meaning it maintains the relative order of equal
elements.

4. Quick Sort:

#include<bits/stdc++.h>
using namespace std;

int partition(vector<int>& arr, int low, int high) {


int pivot = arr[high];
int i = low - 1;
for (int j = low; j < high; j++) {
if (arr[j] <= pivot) {
i++;
swap(arr[i], arr[j]);
}
}
swap(arr[i + 1], arr[high]);
return i + 1;
}

void quickSort(vector<int>& arr, int low, int high) {


if (low < high) {
int pi = partition(arr, low, high);
quickSort(arr, low, pi - 1);
quickSort(arr, pi, high);
}
}

int main(){
int n;
cin>>n;
srand(time(0));
vector<int> rand_vec;
for (int i = 0; i < 100000; ++i) {
int rand_num = rand() % 9000 + 1000;
rand_vec.push_back(rand_num);
}
quickSort(rand_vec,0,99999);
//for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<"
";
cout<<endl;
}

//time - 63ms

 How It Sorts:

1. Partition: Selects a pivot element and partitions the array such that elements less
than the pivot are on the left and elements greater than the pivot are on the right.
2. Recursion: Recursively applies the partitioning process to the left and right
subarrays.

 Optimizations: Efficient partitioning strategies, random pivot selection to avoid worst-


case scenarios.

 Drawback: Worst-case time complexity of O(n²) if the pivot selection is poor.


5. Tim Sort:

#include<bits/stdc++.h>
using namespace std;

const int MIN_MERGE = 32;

void insertionSort(vector<int>& arr, int left, int right) {


for (int i = left + 1; i <= right; i++) {
int key = arr[i];
int j = i - 1;
while (j >= left && arr[j] > key) {
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = key;
}
}

void merge(vector<int>& arr, int l, int m, int r) {


int len1 = m - l + 1, len2 = r - m;
vector<int> left(len1), right(len2);

for (int i = 0; i < len1; i++) left[i] = arr[l + i];


for (int i = 0; i < len2; i++) right[i] = arr[m + 1 + i];

int i = 0, j = 0, k = l;
while (i < len1 && j < len2) {
if (left[i] <= right[j]) {
arr[k] = left[i];
i++;
} else {
arr[k] = right[j];
j++;
}
k++;
}

while (i < len1) {


arr[k] = left[i];
i++;
k++;
}

while (j < len2) {


arr[k] = right[j];
j++;
k++;
}
}

void timSort(vector<int>& arr) {


int n = arr.size();

for (int i = 0; i < n; i += MIN_MERGE)


insertionSort(arr, i, min(i + MIN_MERGE - 1, n - 1));

for (int size = MIN_MERGE; size < n; size *= 2) {


for (int left = 0; left < n; left += 2 * size) {
int mid = left + size - 1;
int right = min(left + 2 * size - 1, n - 1);

if (mid < right)


merge(arr, left, mid, right);
}
}
}

int main(){
int n;
cin>>n;
srand(time(0));
vector<int> rand_vec;
for (int i = 0; i < 100000; ++i) {
int rand_num = rand() % 9000 + 1000;
rand_vec.push_back(rand_num);
}
timSort(rand_vec);
//for(int i=0;i<rand_vec.size();i++)cout<<rand_vec[i]<<" ";
cout<<endl;
}

//time - 69ms (min merge 64)


//time - 68ms (min merge 32)

 How It Sorts:

1. Runs: Divides the array into small segments called "runs" and sorts each run using
Insertion Sort. The minimum size of each run is typically set to 32 or 64.
2. Merge: Merges the sorted runs using a technique similar to Merge Sort.

 Optimizations:

 Runs: Efficiently handles already sorted or nearly sorted data by exploiting natural
runs in the array.
 Insertion Sort for Runs: Uses Insertion Sort for sorting individual runs due to its
efficiency on small arrays.

 Purpose of Minimum Merge Size: The MIN_MERGE parameter defines the minimum
size of the runs. Using a small MIN_MERGE ensures that each run is small enough for
Insertion Sort to be efficient, but large enough to minimize the overhead of frequent
merging. This balances the trade-offs between the insertion and merge phases, optimizing
overall performance.

Mention of Optimizations Used

 STL std::sort: Benefits from compiler-specific and hardware-level optimizations.


 IntroSort: Combines Quicksort, Heapsort, and Insertion Sort to optimize
performance.
 Merge Sort: Uses a temporary array for merging sorted halves.
 Quick Sort: Implements partitioning efficiently to minimize swaps.
 Tim Sort: Leverages runs of already sorted data and minimizes data movements
during merging.

3. Experimental Setup

Test Environment

 CPU: Intel Core i7 7820HQ 2.90GHz


 RAM: 32 GB Dual Channel DDR4
 OS: Windows 10 Pro
 Compiler: GNU g++

Dataset

 Data: 100,000 random integers in the range [1000, 9999]


 Patterns: Random dataset

Input Sizes Used for Testing

 Single dataset size: 100,000 elements

4. Results and Analysis

Execution Time Comparisons

The execution times for each sorting algorithm based on the implementation are as follows:

Algorithm Execution Time


STL std::sort 59 ms
IntroSort 78 ms
Merge Sort 82 ms
Quick Sort 63 ms
Tim Sort 68 ms

Analysis of Performance in Different Cases

 STL std::sort: Fastest due to extensive optimizations.


 IntroSort: Slower than std::sort but avoids worst-case scenarios.
 Merge Sort: Stable and consistent O(n log n) performance but slower due to
additional memory usage.
 Quick Sort: Efficient on average but requires careful handling of worst-case
scenarios.
 Tim Sort: Performs well on real-world data due to its hybrid nature.
Memory Usage and Stability

 STL std::sort: In-place, not stable.


 IntroSort: In-place, not stable.
 Merge Sort: Uses additional memory, stable.
 Quick Sort: In-place, not stable.
 Tim Sort: Uses minimal additional memory, stable.

5. Discussion

Efficiency Comparison

 STL std::sort is the most efficient due to its hybrid algorithm and low-level
optimizations.
 Quick Sort shows good average performance but can degrade in worst-case
scenarios.
 Tim Sort is very efficient on real-world data due to its ability to handle partially
sorted data efficiently.

Performance in Specific Scenarios

 Merge Sort performs well for stable sorting but has higher memory overhead.
 IntroSort balances worst-case performance and average-case efficiency.
 Tim Sort excels in scenarios with partially sorted data.

Practical Use Cases

 STL std::sort: General-purpose sorting in C++.


 Merge Sort: When stability is required.
 Quick Sort: Fast sorting with average-case efficiency.
 Tim Sort: Handling real-world, partially sorted data.
 IntroSort: Balancing worst-case and average-case performance.

6. Conclusion

Summary of Findings

 STL std::sort: Fastest and most optimized for general use.


 IntroSort: Good hybrid algorithm but outperformed by std::sort.
 Merge Sort: Reliable stable sort with higher memory overhead.
 Quick Sort: Efficient on average but needs careful handling of worst-case scenarios.
 Tim Sort: Efficient on real-world data with minimal additional memory usage.

Suggested Best Sorting Algorithm


Based on observations, STL std::sort is recommended for most sorting tasks due to its
optimized performance and versatility. For stable sorting, Merge Sort or Tim Sort are
excellent choices.

7. References

 STL std::sort: cppreference.com - std::sort


 Introsort: Wikipedia - Introsort
 TimSort: Wikipedia - TimSort
 Merge Sort: Wikipedia - Merge Sort
 Quick Sort: Wikipedia - Quick Sort

You might also like