CSC 344 - Algorithms and Complexity: Lecture #3 - Internal Sorting
Selection Sort
// selectionSort() - The selection sort where we
// seek the ith smallest value and
// swap it into its proper place
void selectionSort(int x[], int n) {
int i, j, min;
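• Time efficiency: Θ(n²) in all cases
• Space efficiency: in-place, Θ(1) extra space
• Stability: not stable (the swap can move equal keys past each other)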
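The function above is only partly shown, so here is a minimal sketch of how a selection sort along these lines might look. The name selectionSortSketch is an assumption made for illustration and is not the lecture's own code.

```cpp
#include <utility>  // std::swap

// Sketch of selection sort (hypothetical name): on pass i, find the
// smallest element of x[i..n-1] and swap it into position i.
void selectionSortSketch(int x[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;                    // index of the smallest value seen so far
        for (int j = i + 1; j < n; j++) {
            if (x[j] < x[min])
                min = j;
        }
        if (min != i)
            std::swap(x[i], x[min]);    // at most one swap per pass
    }
}
```

The inner loop always scans the whole unsorted part, so the number of comparisons does not depend on the input order.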
Insertion Sort
• To sort array A[0..n-1], sort A[0..n-2] recursively and
then insert A[n-1] in its proper place among the
sorted A[0..n-2]
• Usually implemented bottom up (nonrecursively)
• Example: Sort 6, 4, 1, 8, 5
6|4 1 8 5
4 6|1 8 5
1 4 6|8 5
1 4 6 8|5
1 4 5 6 8
Pseudocode of Insertion Sort
x[j+1] = temp;
}
}
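Only the tail of the routine survives above, so the following is a hedged sketch of the usual bottom-up insertion sort. The name insertionSortSketch is hypothetical; the variables temp and j are chosen to match the surviving fragment.

```cpp
// Sketch of bottom-up insertion sort (hypothetical name): x[0..i-1] is
// already sorted, and x[i] is inserted into its proper place.
void insertionSortSketch(int x[], int n) {
    for (int i = 1; i < n; i++) {
        int temp = x[i];                // value being inserted
        int j = i - 1;
        // Shift larger elements one place to the right
        while (j >= 0 && x[j] > temp) {
            x[j + 1] = x[j];
            j--;
        }
        x[j + 1] = temp;                // drop the value into the gap
    }
}
```

Because the shift uses a strict comparison (x[j] > temp), equal keys keep their relative order in this sketch.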
Divide and Conquer
• The best-known algorithm design
strategy:
1. Divide instance of problem into two or more
smaller instances
2. Solve smaller instances recursively
3. Obtain solution to original (larger) instance by
combining these solutions
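As a small illustration of the three steps, here is a sketch that finds the largest value in an array by divide and conquer. The function maxSketch is a made-up example and is not part of the lecture's code.

```cpp
// Sketch (hypothetical example): largest value in x[low..high],
// assuming low <= high.
int maxSketch(const int x[], int low, int high) {
    if (low == high)                    // instance of size 1: solved directly
        return x[low];
    int mid = low + (high - low) / 2;   // 1. divide into two smaller instances
    int leftMax = maxSketch(x, low, mid);        // 2. solve each recursively
    int rightMax = maxSketch(x, mid + 1, high);
    return leftMax > rightMax ? leftMax : rightMax;  // 3. combine
}
```

A plain loop would do the same job here; the point is only to show the divide, solve, and combine structure that mergesort and quicksort share.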
Mergesort
• Split array A[0..n-1] into two roughly equal halves and make
copies of each half in arrays B and C
• Sort arrays B and C recursively
• Merge sorted arrays B and C into array A as follows:
– Repeat the following until no elements remain in one of the arrays:
– Compare the first elements in the remaining unprocessed portions of
the arrays
– Copy the smaller of the two into A, while incrementing the index
indicating the unprocessed portion of that array
– Once all elements in one of the arrays are processed, copy the
remaining unprocessed elements from the other array into A.
Mergesort Example
8 3 2 9 7 1 5 4
8 3 2 9 | 7 1 5 4
8 3 | 2 9 | 7 1 | 5 4
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
3 8 | 2 9 | 1 7 | 4 5
2 3 8 9 | 1 4 5 7
1 2 3 4 5 7 8 9
Analysis of Mergesort
• All cases have same efficiency: Θ(n log n)
• Number of comparisons in the worst case is close to
theoretical minimum for comparison-based sorting:
log₂ n! ≈ n log₂ n − 1.44n
• Space requirement: Θ(n) (not in-place)
• Can be implemented without recursion (bottom-up)
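The last point can be sketched as follows: instead of recursing, merge runs of width 1, then 2, then 4, and so on. This is one possible bottom-up version, assuming a std::vector and the standard std::merge; the name bottomUpMergeSortSketch is hypothetical.

```cpp
#include <algorithm>  // std::merge, std::min
#include <cstddef>    // std::size_t
#include <vector>

// Sketch of a non-recursive (bottom-up) mergesort (hypothetical name).
void bottomUpMergeSortSketch(std::vector<int>& a) {
    const std::size_t n = a.size();
    std::vector<int> buffer(n);         // the Θ(n) extra space
    for (std::size_t width = 1; width < n; width *= 2) {
        // Merge each pair of adjacent sorted runs of length `width`
        for (std::size_t lo = 0; lo < n; lo += 2 * width) {
            std::size_t mid = std::min(lo + width, n);
            std::size_t hi = std::min(lo + 2 * width, n);
            std::merge(a.begin() + lo, a.begin() + mid,
                       a.begin() + mid, a.begin() + hi,
                       buffer.begin() + lo);
        }
        a.swap(buffer);                 // merged runs become the input of the next pass
    }
}
```

Each pass writes all n elements into the buffer, so swapping the two vectors after a pass is safe.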
Mergesort
// mergeSort() - A recursive version of the merge
// sort, where the array is divided
// into smaller and smaller subarrays
// and then merged together in order
void mergeSort(int x[], int n) {
int *y, *z;
if (n > 1) {
// Set up arrays of the required size
y = new int[n/2];
z = new int[n - n/2];  // the second half holds the extra element when n is odd
Merge
// merge() - Merge the two subarrays together
// Maintaining the order
void merge(int b[], int bSize,
int c[], int cSize,
int a[], int aSize) {
int i = 0, j = 0, k = 0;
if (i == bSize) {
for (int m = j; m < cSize; m++)
a[bSize+m] = c[m];
}
else {
for (int m = i; m < bSize; m++)
a[cSize+m] = b[m];
}
}
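Since the two routines above are only partly shown, here is a self-contained sketch of how the pieces might fit together. The names mergeSketch and mergeSortSketch are hypothetical, and std::vector is used for the two half-copies as an assumption of this sketch so that no explicit delete[] is needed.

```cpp
#include <vector>

// Sketch: merge sorted b[0..bSize-1] and c[0..cSize-1] into a,
// which must have room for bSize + cSize elements.
void mergeSketch(const int b[], int bSize, const int c[], int cSize, int a[]) {
    int i = 0, j = 0, k = 0;
    while (i < bSize && j < cSize) {
        if (b[i] <= c[j])               // <= takes from b first, keeping the sort stable
            a[k++] = b[i++];
        else
            a[k++] = c[j++];
    }
    while (i < bSize) a[k++] = b[i++];  // copy whatever is left of b
    while (j < cSize) a[k++] = c[j++];  // copy whatever is left of c
}

// Sketch of recursive mergesort on x[0..n-1].
void mergeSortSketch(int x[], int n) {
    if (n <= 1) return;
    int half = n / 2;
    std::vector<int> y(x, x + half);    // first half, n/2 elements
    std::vector<int> z(x + half, x + n);  // second half, n - n/2 elements
    mergeSortSketch(y.data(), half);
    mergeSortSketch(z.data(), n - half);
    mergeSketch(y.data(), half, z.data(), n - half, x);
}
```

Note that the second half has n - n/2 elements, which is one more than the first half when n is odd.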
Quicksort
• Select a pivot (partitioning element) – here, the first element
• Rearrange the list so that all the elements in the first s
positions are smaller than or equal to the pivot and all the
elements in the remaining n-s positions are larger than or equal
to the pivot (see next slide for an algorithm)
A[i] ≤ p | A[i] ≥ p
• Exchange the pivot with the last element in the first (i.e., ≤)
subarray — the pivot is now in its final position
• Sort the two subarrays recursively
Quicksort Example
5 3 1 9 8 2 4 7
2 3 1 4 5 8 9 7
1 2 3 4 5 7 8 9
1 2 3 4 5 7 8 9
QuickSort
// quickSort() - Call the recursive quick
// sort method
void quickSort(int x[], int n) {
quick(x, 0, n-1);
}
Quick
// quick() - Place the pivot in its proper
// place and recursively sort
// everything on either side of it
void quick(int x[], int low, int high) {
int pivotPlace;
pivot = x[low];
i = low;
j = high+1;
do {
j = j - 1;
} while (x[j] > pivot);
swap(x[i], x[j]);
} while (i < j);
return j;
}
Swap
// swap() - Swap the two parameters' values
void swap(int &a, int &b) {
int temp;
temp = a;
a = b;
b = temp;
}
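The quicksort routines above are incomplete as shown, so the following is a hedged, self-contained sketch of the same idea: the first element is the pivot, and two indices scan toward each other. The names partitionSketch, quickSketch, and quickSortSketch are hypothetical, and std::swap stands in for the swap() above.

```cpp
#include <utility>  // std::swap

// Sketch: partition x[low..high] around pivot x[low] and return the
// pivot's final position.
int partitionSketch(int x[], int low, int high) {
    int pivot = x[low];
    int i = low, j = high + 1;
    while (true) {
        do { i++; } while (i <= high && x[i] < pivot);  // scan right for a value >= pivot
        do { j--; } while (x[j] > pivot);               // scan left for a value <= pivot
        if (i >= j) break;                              // the scans have crossed
        std::swap(x[i], x[j]);
    }
    std::swap(x[low], x[j]);            // pivot goes to its final position
    return j;
}

void quickSketch(int x[], int low, int high) {
    if (low < high) {
        int pivotPlace = partitionSketch(x, low, high);
        quickSketch(x, low, pivotPlace - 1);
        quickSketch(x, pivotPlace + 1, high);
    }
}

void quickSortSketch(int x[], int n) {
    quickSketch(x, 0, n - 1);
}
```

Partitioning 5 3 1 9 8 2 4 7 with this sketch gives 2 3 1 4 5 8 9 7, with the pivot 5 at index 4.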
Analysis of Quicksort
• Best case (split in the middle): Θ(n log n)
• Worst case (already sorted array, with the first element as pivot): Θ(n²)
• Average case (random arrays): Θ(n log n)
Analysis of Quicksort
• Improvements:
– better pivot selection: median of three partitioning
– switch to insertion sort on small subfiles
– elimination of recursion
• Combined, these give a 20-25% improvement
• Considered the method of choice for internal
sorting of large files (n ≥ 10000)
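Two of the improvements can be sketched together: median-of-three pivot selection and a switch to insertion sort on small subarrays. This is an illustration only. The names and the cutoff value of 10 are assumptions, and no timing claim is made for this particular code.

```cpp
#include <utility>  // std::swap

const int CUTOFF_SKETCH = 10;           // hypothetical threshold for "small"

// Insertion sort on x[low..high], used for small subarrays.
void insertionRangeSketch(int x[], int low, int high) {
    for (int i = low + 1; i <= high; i++) {
        int temp = x[i];
        int j = i - 1;
        while (j >= low && x[j] > temp) {
            x[j + 1] = x[j];
            j--;
        }
        x[j + 1] = temp;
    }
}

void quickImprovedSketch(int x[], int low, int high) {
    if (high - low + 1 <= CUTOFF_SKETCH) {
        insertionRangeSketch(x, low, high);
        return;
    }
    // Median of three: order x[low], x[mid], x[high], then use the median as pivot
    int mid = low + (high - low) / 2;
    if (x[mid] < x[low]) std::swap(x[mid], x[low]);
    if (x[high] < x[low]) std::swap(x[high], x[low]);
    if (x[high] < x[mid]) std::swap(x[high], x[mid]);
    std::swap(x[low], x[mid]);          // the median now sits at x[low]
    int pivot = x[low];
    int i = low, j = high + 1;
    while (true) {
        do { i++; } while (i <= high && x[i] < pivot);
        do { j--; } while (x[j] > pivot);
        if (i >= j) break;
        std::swap(x[i], x[j]);
    }
    std::swap(x[low], x[j]);
    quickImprovedSketch(x, low, j - 1);
    quickImprovedSketch(x, j + 1, high);
}

void quickSortImprovedSketch(int x[], int n) {
    quickImprovedSketch(x, 0, n - 1);
}
```

With the median of three as pivot, an already sorted array splits near the middle instead of producing the worst case.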
Heaps and Heapsort
• Definition - A heap is a binary tree with keys at its nodes (one
key per node) such that:
– It is essentially complete, i.e., all its levels are full except
possibly the last level, where only some rightmost keys may
be missing
– The key at each node is ≥ the keys at its children
• NB: A heap's elements are ordered top down (along any path
down from its root), but they are not ordered left to right
Some Important Properties of a Heap
• Given n, there exists a unique binary tree with
n nodes that is essentially complete, with height
h = ⌊log₂ n⌋
• The root contains the largest key
• The subtree rooted at any node of a heap is
also a heap
• A heap can be represented as an array
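To make the array representation concrete, here is a small sketch that checks the heap condition directly on an array stored in level order. It assumes 0-based indexing, where the children of position i sit at 2i+1 and 2i+2 and the parent of i sits at (i-1)/2. The name isMaxHeapSketch is hypothetical.

```cpp
// Sketch: does a[0..n-1], read in level order, satisfy the heap condition?
// Assumes 0-based indexing: parent of position i is at (i - 1) / 2.
bool isMaxHeapSketch(const int a[], int n) {
    for (int i = 1; i < n; i++) {
        int parent = (i - 1) / 2;
        if (a[parent] < a[i])           // a child larger than its parent
            return false;
    }
    return true;
}
```

For example, the array 9 5 3 1 4 2 passes this check, while 2 9 7 6 5 8 does not. Being essentially complete comes for free, because the array has no gaps.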
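Example: the heap with root 9, children 5 and 3, and leaves 1, 4, 2 is stored as the array 9 5 3 1 4 2
Example (heap construction for the list 2, 9, 7, 6, 5, 8), trees in level order:
2 9 7 6 5 8 > 2 9 8 6 5 7 > 9 2 8 6 5 7 > 9 6 8 2 5 7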
Heapsort
• Stage 1: Construct a heap for a given list of n
keys
• Stage 2: Repeat operation of root removal n-1
times:
– Exchange keys in the root and in the last
(rightmost) leaf
– Decrease heap size by 1
– If necessary, swap new root with larger child until
the heap condition holds
Analysis of Heapsort
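• Stage 1: Build heap for a given list of n keys
worst-case (for a full tree, $n = 2^{h+1} - 1$):
$C(n) = \sum_{i=0}^{h-1} 2(h-i)\,2^i = 2\,(n - \log_2(n+1)) \in \Theta(n)$
where $2^i$ is the number of nodes at level i
• Stage 2: Repeat operation of root removal n-1 times (fix heap)
worst-case:
$C(n) \le \sum_{i=1}^{n-1} 2\lfloor \log_2 i \rfloor \in O(n \log n)$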
[Figure: inserting key 10 into a heap by sifting it up; trees in level order: 9 6 8 2 5 7 10 > 9 6 10 2 5 7 8 > 10 6 9 2 5 7 8]
Heapsort
void heapSort(int *a, int count)
{
int start, end;
/*
 * heapify - Rearrange the elements so that each
 * parent's value is at least as large as either child's
 */
for (start = (count-2)/2; start >=0; start--) {
siftDown( a, start, count);
}
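The listing above stops after the heap-building loop and does not show siftDown, so here is a hedged sketch of both stages. The names siftDownSketch, buildHeapSketch, and heapSortSketch are hypothetical. In this sketch the third argument of siftDownSketch is the number of elements still in the heap, which is an assumption about how the original siftDown is meant to be called.

```cpp
#include <utility>  // std::swap

// Sketch: restore the heap condition below position `start`,
// treating a[0..count-1] as the heap (0-based indexing).
void siftDownSketch(int a[], int start, int count) {
    int root = start;
    while (2 * root + 1 < count) {      // while root has at least one child
        int child = 2 * root + 1;
        if (child + 1 < count && a[child] < a[child + 1])
            child++;                    // pick the larger child
        if (a[root] >= a[child])
            return;                     // heap condition already holds
        std::swap(a[root], a[child]);
        root = child;
    }
}

// Stage 1: build a heap bottom-up, starting from the last parent.
void buildHeapSketch(int a[], int count) {
    for (int start = (count - 2) / 2; start >= 0; start--)
        siftDownSketch(a, start, count);
}

// Stage 2: repeatedly move the root to the end and shrink the heap.
void heapSortSketch(int a[], int count) {
    buildHeapSketch(a, count);
    for (int end = count - 1; end > 0; end--) {
        std::swap(a[0], a[end]);        // largest remaining key goes to a[end]
        siftDownSketch(a, 0, end);      // heap now has `end` elements
    }
}
```

Building a heap from 2 9 7 6 5 8 with this sketch gives 9 6 8 2 5 7.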
Bubble Sort
// bubbleSort() - A bubble sort function
void bubbleSort(int x[], int n) {
bool switched; // Have we switched them
// this time?
After Pass 1  13 4 25 14 1 29 18 31
After Pass 2  4 13 14 1 25 18 29 31
After Pass 3  4 13 1 14 18 25 29 31
After Pass 4  4 1 13 14 18 25 29 31
After Pass 5  1 4 13 14 18 25 29 31
After Pass 6  1 4 13 14 18 25 29 31  (final scan to confirm it's in order)
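Only the start of bubbleSort() is shown above, so the following is a minimal sketch of a bubble sort with the same switched flag. The name bubbleSortSketch is hypothetical.

```cpp
// Sketch of bubble sort (hypothetical name): each pass swaps adjacent
// out-of-order pairs; stop as soon as a pass makes no swap.
void bubbleSortSketch(int x[], int n) {
    bool switched = true;               // did the last pass swap anything?
    for (int i = 0; i < n - 1 && switched; i++) {
        switched = false;
        // After pass i, the last i + 1 positions hold their final values
        for (int j = 0; j < n - i - 1; j++) {
            if (x[j] > x[j + 1]) {
                int temp = x[j];
                x[j] = x[j + 1];
                x[j + 1] = temp;
                switched = true;
            }
        }
    }
}
```

The pass that makes no swap corresponds to the final confirming scan in the trace.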
Bubble Sort vs Cocktail Shaker Sort
• A large value at the beginning of the array will move all the
way to the end in one pass if the scan goes from beginning to
end.
• A small value at the end of the array will move all the way to
the beginning in one pass if the scan goes from end to
beginning
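• In both cases the opposite situation (a small value at the end with a
forward scan, or a large value at the beginning with a backward scan)
requires up to n passes.
• Scanning from beginning to end and then from end to beginning
removes this dependence. We call such a sort the cocktail
shaker sort.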
// Then we bubble up
for (j = n - i - 2; j > 0; --j)
if (x[j] > x[j+1]) {
// If so, swap them
switched = true;
temp = x[j];
x[j] = x[j+1];
x[j+1] = temp;
}
After Pass 1a  13 4 25 14 1 29 18 31
After Pass 1b  13 1 4 25 14 18 29 31
After Pass 2a  1 4 13 14 18 25 29 31
After Pass 2b  1 4 13 14 18 25 29 31  (final scan to confirm it's in order)
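As a self-contained illustration of the two-way scan, here is a sketch of a cocktail shaker sort. The name cocktailShakerSortSketch and the low/high bounds are assumptions of this sketch. Its backward scan compares all the way down to the first unsorted position, so its intermediate states can differ from the trace above.

```cpp
// Sketch of cocktail shaker sort (hypothetical name): alternate a forward
// pass that carries a large value to the end with a backward pass that
// carries a small value to the beginning.
void cocktailShakerSortSketch(int x[], int n) {
    int low = 0, high = n - 1;          // x[low..high] is the unsorted part
    bool switched = true;
    while (switched && low < high) {
        switched = false;
        for (int j = low; j < high; j++) {       // forward pass
            if (x[j] > x[j + 1]) {
                int temp = x[j];
                x[j] = x[j + 1];
                x[j + 1] = temp;
                switched = true;
            }
        }
        high--;                         // largest value is now in place
        if (!switched) break;
        switched = false;
        for (int j = high - 1; j >= low; j--) {  // backward pass
            if (x[j] > x[j + 1]) {
                int temp = x[j];
                x[j] = x[j + 1];
                x[j + 1] = temp;
                switched = true;
            }
        }
        low++;                          // smallest value is now in place
    }
}
```

Shrinking the unsorted range from both ends means neither a large value at the front nor a small value at the back forces a long series of passes.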