Data Structures
Sorting Algorithms
1
Sorting means . . .
The values stored in an array have
keys of a type for which the relational
operators are defined. (We also
assume unique keys.)
Sorting rearranges the elements into
either ascending or descending order
within the array. (We’ll use ascending
order.)
2
Straight Selection Sort
Divides the array into two parts: already sorted, and
not yet sorted.
On each pass, finds the smallest of the unsorted
elements, and swaps it into its correct place, thereby
increasing the number of sorted elements by one.
values: [0] 36  [1] 24  [2] 10  [3] 6  [4] 12
3
Selection Sort: Pass One
values: [0] 36  [1] 24  [2] 10  [3] 6  [4] 12  (all UNSORTED)
4
Selection Sort: End Pass One
values: [0] 6 (SORTED)  |  [1] 24  [2] 10  [3] 36  [4] 12 (UNSORTED)
5
Selection Sort: Pass Two
values: [0] 6 (SORTED)  |  [1] 24  [2] 10  [3] 36  [4] 12 (UNSORTED)
6
Selection Sort: End Pass Two
values: [0] 6  [1] 10 (SORTED)  |  [2] 24  [3] 36  [4] 12 (UNSORTED)
7
Selection Sort: Pass Three
values: [0] 6  [1] 10 (SORTED)  |  [2] 24  [3] 36  [4] 12 (UNSORTED)
8
Selection Sort: End Pass Three
values: [0] 6  [1] 10  [2] 12 (SORTED)  |  [3] 36  [4] 24 (UNSORTED)
9
Selection Sort: Pass Four
values: [0] 6  [1] 10  [2] 12 (SORTED)  |  [3] 36  [4] 24 (UNSORTED)
10
Selection Sort: End Pass Four
values: [0] 6  [1] 10  [2] 12  [3] 24  [4] 36  (all SORTED)
11
Selection Sort:
How many comparisons?
values: [0] 6  [1] 10  [2] 12  [3] 24  [4] 36
4 compares for values[0]
3 compares for values[1]
2 compares for values[2]
1 compare for values[3]
Total = 4 + 3 + 2 + 1 = 10
12
For selection sort in general
The number of comparisons when the
array contains N elements is
Sum = (N-1) + (N-2) + . . . + 2 + 1
13
Notice that . . .
Sum = (N-1) + (N-2) + . . . + 2 + 1
Sum = 1 + 2 + . . . + (N-2) + (N-1)
Adding the two lines term by term gives N-1 terms, each equal to N:
2 * Sum = N * (N-1)
Sum = N * (N-1) / 2
14
For selection sort in general
The number of comparisons when the
array contains N elements is
Sum = (N-1) + (N-2) + . . . + 2 + 1
Sum = N * (N-1) /2
Sum = 0.5 N^2 - 0.5 N
Sum = O(N^2)
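The count above can be checked by instrumenting the algorithm. The following is a minimal sketch, not code from the slides: the function name CountSelectionSortComparisons is hypothetical, it works on int arrays only, and it uses std::swap in place of the slides' Swap.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Hypothetical helper (not from the slides): selection sort on an int
// array that also returns how many element comparisons it performed.
long CountSelectionSortComparisons(int values[], int numValues)
{
    assert(numValues >= 0);
    long comparisons = 0;
    for (int current = 0; current < numValues - 1; current++)
    {
        int indexOfMin = current;
        for (int index = current + 1; index < numValues; index++)
        {
            comparisons++;  // one comparison per unsorted element examined
            if (values[index] < values[indexOfMin])
                indexOfMin = index;
        }
        std::swap(values[current], values[indexOfMin]);
    }
    return comparisons;
}
```

For the five-element example array the function should report 4 + 3 + 2 + 1 = 10 comparisons, and in general N * (N-1) / 2, regardless of the initial order of the elements.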
15
template <class ItemType >
int MinIndex ( ItemType values [ ] , int start , int end )
// Post: Function value = index of the smallest value in
// values [start] . . values [end].
{
int indexOfMin = start ;
for ( int index = start + 1 ; index <= end ; index++ )
if ( values [ index ] < values [ indexOfMin ] )
indexOfMin = index ;
return indexOfMin;
}
16
template <class ItemType >
void SelectionSort ( ItemType values [ ] , int numValues )
// Post: Sorts array values[0 . . numValues-1 ] into ascending
// order by key
{
int endIndex = numValues - 1 ;
for ( int current = 0 ; current < endIndex ; current++ )
Swap ( values [ current ] ,
values [ MinIndex ( values, current, endIndex ) ] ) ;
}
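The sorting functions in these slides call Swap, which is never defined here. A minimal sketch of what such a helper might look like is given below; the parameter names and the small SwapExample function are assumptions added for illustration.

```cpp
#include <cassert>

// Assumed helper: the slides use Swap but do not show its definition.
template <class ItemType>
void Swap(ItemType& item1, ItemType& item2)
{
    ItemType temp = item1;  // hold the first value while it is overwritten
    item1 = item2;
    item2 = temp;
}

// Hypothetical usage check, not part of the slides.
inline void SwapExample()
{
    int x = 36, y = 6;
    Swap(x, y);
    assert(x == 6 && y == 36);
}
```

Going through a temporary keeps the helper safe when both arguments refer to the same array element, which can happen in SelectionSort when the smallest unsorted element is already in place.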
17
Bubble Sort
Compares neighboring pairs of array elements,
starting with the last array element, and swaps
neighbors whenever they are not in correct order.
On each pass, this causes the smallest element to
“bubble up” to its correct place in the array.
values: [0] 36  [1] 24  [2] 10  [3] 6  [4] 12
18
template <class ItemType >
void BubbleUp ( ItemType values [ ] , int start , int end )
// Post: Neighboring elements that were out of order have been
// swapped between values [start] and values [end],
// beginning at values [end].
{
for ( int index = end ; index > start ; index-- )
if (values [ index ] < values [ index - 1 ] )
Swap ( values [ index ], values [ index - 1 ] ) ;
}
19
template <class ItemType >
void BubbleSort ( ItemType values [ ] , int numValues )
// Post: Sorts array values[0 . . numValues-1 ] into ascending
// order by key
{
int current = 0 ;
while ( current < numValues - 1 )
{
BubbleUp ( values , current , numValues - 1 ) ;
current++ ;
}
}
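To see how much work bubble sort does, the two functions can be instrumented to count comparisons. This is a sketch under stated assumptions: the names BubbleUpCounting and BubbleSortCounting are hypothetical, the code handles int arrays only, and std::swap stands in for Swap.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Hypothetical instrumented variant of BubbleUp: returns the number of
// neighbor comparisons made during one pass.
long BubbleUpCounting(int values[], int start, int end)
{
    long comparisons = 0;
    for (int index = end; index > start; index--)
    {
        comparisons++;
        if (values[index] < values[index - 1])
            std::swap(values[index], values[index - 1]);
    }
    return comparisons;
}

// Hypothetical instrumented variant of BubbleSort.
long BubbleSortCounting(int values[], int numValues)
{
    assert(numValues >= 0);
    long comparisons = 0;
    int current = 0;
    while (current < numValues - 1)
    {
        comparisons += BubbleUpCounting(values, current, numValues - 1);
        current++;  // advance past the element that just bubbled into place
    }
    return comparisons;
}
```

Each pass over values[current..numValues-1] makes numValues-1-current comparisons, so the total is (N-1) + (N-2) + . . . + 1, the same sum as for selection sort.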
20
Insertion Sort
One by one, each as yet unsorted array element is
inserted into its proper place with respect to the
already sorted elements.
On each pass, this causes the number of already
sorted elements to increase by one.
values: [0] 36  [1] 24  [2] 10  [3] 6  [4] 12
21
Insertion Sort
Works like someone who
“inserts” one more card at
a time into a hand of cards
that are already sorted.
To insert 12, we need to
make room for it by moving
first 36 and then 24.
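The card analogy can be sketched directly: larger cards slide one position to the right until the open slot is where the new card belongs. The function InsertByShifting and the hand contents 6, 10, 24, 36 are assumptions made for illustration; the slides' InsertItem achieves the same result with repeated swaps instead of shifts.

```cpp
#include <cassert>

// Hypothetical helper: values[0..sortedCount-1] is already sorted and
// values[sortedCount] is free. Inserts newItem in order by shifting larger
// elements one place to the right; returns how many elements were moved.
int InsertByShifting(int values[], int sortedCount, int newItem)
{
    assert(sortedCount >= 0);
    int moves = 0;
    int position = sortedCount;  // index of the open slot
    while (position > 0 && newItem < values[position - 1])
    {
        values[position] = values[position - 1];  // slide one card right
        position--;
        moves++;
    }
    values[position] = newItem;
    return moves;
}
```

With the assumed hand 6, 10, 24, 36, inserting 12 moves 36 and then 24, which is two moves, and leaves 6, 10, 12, 24, 36.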
25
template <class ItemType >
void InsertItem ( ItemType values [ ] , int start , int end )
// Post: Elements between values [start] and values [end]
// have been sorted into ascending order by key.
{
bool finished = false ;
int current = end ;
bool moreToSearch = ( current != start ) ;
while ( moreToSearch && !finished )
{
if (values [ current ] < values [ current - 1 ] )
{
Swap ( values [ current ], values [ current - 1 ] ) ;
current-- ;
moreToSearch = ( current != start );
}
else
finished = true ;
}
}
26
template <class ItemType >
void InsertionSort ( ItemType values [ ] , int numValues )
// Post: Sorts array values[0 . . numValues-1 ] into ascending
// order by key
{
for ( int count = 0 ; count < numValues ; count++ )
InsertItem ( values , 0 , count ) ;
}
27
Sorting Algorithms
and Number of Comparisons
Simple Sorts: O(N^2)
Straight Selection Sort
Bubble Sort
Insertion Sort
More Complex Sorts: O(N*log N)
Quick Sort
Merge Sort
Heap Sort
28
Recall that . . .
A heap is a binary tree that satisfies these
special SHAPE and ORDER properties:
Its shape must be a complete binary tree.
For each node in the heap, the value
stored in that node is greater than or
equal to the value in each of its children.
29
The largest element
in a heap is always found in the root node
root
70
60 12
40 30 8 10
30
The heap can be stored
in an array
values: [0] 70  [1] 60  [2] 12  [3] 40  [4] 30  [5] 8  [6] 10
Tree (array index in brackets):
root: 70 [0]
children of root: 60 [1], 12 [2]
leaves: 40 [3], 30 [4], 8 [5], 10 [6]
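In this array layout the children of the node at index i sit at indexes 2*i + 1 and 2*i + 2, which is the same arithmetic ReheapDown uses. The sketch below checks the ORDER property for an array stored this way; the function name IsHeap is hypothetical and the code is restricted to int arrays.

```cpp
#include <cassert>

// Hypothetical checker: returns true if values[0..numValues-1], read as a
// complete binary tree, satisfies the heap order property (every parent is
// greater than or equal to each of its children).
bool IsHeap(const int values[], int numValues)
{
    assert(numValues >= 0);
    for (int child = 1; child < numValues; child++)
    {
        int parent = (child - 1) / 2;  // inverse of 2*i + 1 and 2*i + 2
        if (values[parent] < values[child])
            return false;
    }
    return true;
}
```

The SHAPE property needs no check here: any array prefix read level by level is a complete binary tree.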
31
Heap Sort Approach
First, make the unsorted array into a heap by
satisfying the order property. Then repeat the
steps below until there are no more unsorted
elements.
Take the root (maximum) element off the heap
by swapping it into its correct place in the
array at the end of the unsorted elements.
Reheap the remaining unsorted elements.
(This puts the next-largest element into the
root position).
32
After creating the original heap
values: [0] 70  [1] 60  [2] 12  [3] 40  [4] 30  [5] 8  [6] 10
33
Swap root element into last place
in unsorted array
values: [0] 70  [1] 60  [2] 12  [3] 40  [4] 30  [5] 8  [6] 10
34
After swapping root element
into its place
values: [0] 10  [1] 60  [2] 12  [3] 40  [4] 30  [5] 8  |  [6] 70 (NO NEED TO CONSIDER AGAIN)
35
After reheaping remaining
unsorted elements
values: [0] 60  [1] 40  [2] 12  [3] 10  [4] 30  [5] 8  |  [6] 70
36
Swap root element into last place
in unsorted array
values: [0] 60  [1] 40  [2] 12  [3] 10  [4] 30  [5] 8  |  [6] 70
37
After swapping root element
into its place
values: [0] 8  [1] 40  [2] 12  [3] 10  [4] 30  |  [5] 60  [6] 70 (NO NEED TO CONSIDER AGAIN)
38
After reheaping remaining
unsorted elements
values: [0] 40  [1] 30  [2] 12  [3] 10  [4] 8  |  [5] 60  [6] 70
39
Swap root element into last place
in unsorted array
values: [0] 40  [1] 30  [2] 12  [3] 10  [4] 8  |  [5] 60  [6] 70
40
After swapping root element
into its place
values: [0] 8  [1] 30  [2] 12  [3] 10  |  [4] 40  [5] 60  [6] 70 (NO NEED TO CONSIDER AGAIN)
41
After reheaping remaining
unsorted elements
values: [0] 30  [1] 10  [2] 12  [3] 8  |  [4] 40  [5] 60  [6] 70
42
Swap root element into last place
in unsorted array
values: [0] 30  [1] 10  [2] 12  [3] 8  |  [4] 40  [5] 60  [6] 70
43
After swapping root element
into its place
values: [0] 8  [1] 10  [2] 12  |  [3] 30  [4] 40  [5] 60  [6] 70 (NO NEED TO CONSIDER AGAIN)
44
After reheaping remaining
unsorted elements
values: [0] 12  [1] 10  [2] 8  |  [3] 30  [4] 40  [5] 60  [6] 70
45
Swap root element into last place
in unsorted array
values: [0] 12  [1] 10  [2] 8  |  [3] 30  [4] 40  [5] 60  [6] 70
46
After swapping root element
into its place
values: [0] 8  [1] 10  |  [2] 12  [3] 30  [4] 40  [5] 60  [6] 70 (NO NEED TO CONSIDER AGAIN)
47
After reheaping remaining
unsorted elements
values: [0] 10  [1] 8  |  [2] 12  [3] 30  [4] 40  [5] 60  [6] 70
48
Swap root element into last place
in unsorted array
values: [0] 10  [1] 8  |  [2] 12  [3] 30  [4] 40  [5] 60  [6] 70
49
After swapping root element
into its place
values: [0] 8  [1] 10  [2] 12  [3] 30  [4] 40  [5] 60  [6] 70 (ALL ELEMENTS ARE SORTED)
50
template <class ItemType >
void HeapSort ( ItemType values [ ] , int numValues )
// Post: Sorts array values[ 0 . . numValues-1 ] into ascending
// order by key
{
int index ;
// Convert array values[ 0 . . numValues-1 ] into a heap.
for ( index = numValues/2 - 1 ; index >= 0 ; index-- )
ReheapDown ( values , index , numValues - 1 ) ;
// Sort the array.
for ( index = numValues - 1 ; index >= 1 ; index-- )
{
Swap ( values [0] , values [index] );
ReheapDown ( values , 0 , index - 1 ) ;
}
}
51
ReheapDown
template< class ItemType >
void ReheapDown ( ItemType values [ ], int root, int bottom )
// Pre: root is the index of a node that may violate the heap
// order property
// Post: Heap order property is restored between root and bottom
{
int maxChild ;
int rightChild ;
int leftChild ;
leftChild = root * 2 + 1 ;
rightChild = root * 2 + 2 ;
if ( leftChild <= bottom )
{
if ( leftChild == bottom )
maxChild = leftChild;
else
{
if (values [ leftChild ] <= values [ rightChild ] )
maxChild = rightChild ;
else
maxChild = leftChild ;
}
if ( values [ root ] < values [ maxChild ] )
{
Swap ( values [ root ] , values [ maxChild ] ) ;
ReheapDown ( values, maxChild, bottom ) ;
}
}
}
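The two functions can be assembled into one compilable unit as sketched below. The names ReheapDownSketch and HeapSortSketch are hypothetical, and std::swap replaces the slides' Swap. The reheap function is placed before the sort because, when the element type is a built-in type such as int, a template generally needs to see a declaration of the helper it calls.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Sketch of ReheapDown: restores heap order in values[root..bottom],
// assuming only the element at root may be out of place.
template <class ItemType>
void ReheapDownSketch(ItemType values[], int root, int bottom)
{
    int leftChild = root * 2 + 1;
    int rightChild = root * 2 + 2;
    if (leftChild > bottom)
        return;  // root has no children inside the heap
    int maxChild = leftChild;
    if (rightChild <= bottom && values[leftChild] < values[rightChild])
        maxChild = rightChild;
    if (values[root] < values[maxChild])
    {
        std::swap(values[root], values[maxChild]);
        ReheapDownSketch(values, maxChild, bottom);
    }
}

// Sketch of HeapSort: build a heap, then repeatedly move the root
// (current maximum) to the end of the unsorted part and reheap the rest.
template <class ItemType>
void HeapSortSketch(ItemType values[], int numValues)
{
    assert(numValues >= 0);
    for (int index = numValues / 2 - 1; index >= 0; index--)
        ReheapDownSketch(values, index, numValues - 1);
    for (int index = numValues - 1; index >= 1; index--)
    {
        std::swap(values[0], values[index]);
        ReheapDownSketch(values, 0, index - 1);
    }
}
```

Calling HeapSortSketch on an array holding 10, 8, 70, 30, 12, 60, 40 should leave it in ascending order.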
53
Heap Sort:
How many comparisons?
In reheap down, an element is compared with its 2
children (and swapped with the larger). But only one
element at each level makes this comparison, and a
complete binary tree with N nodes has only
O(log2 N) levels.
Example tree (array index in brackets):
level 0 (root): 24 [0]
level 1: 60 [1], 12 [2]
level 2: 30 [3], 40 [4], 8 [5], 10 [6]
level 3: 15 [7], 6 [8], 18 [9], 70 [10]
54
Heap Sort of N elements:
How many comparisons?
O ( N * log N) comparisons
55
Using quick sort algorithm
A..Z
A..L M..Z
A..F G..L M..R S..Z
56
// Recursive quick sort algorithm
template <class ItemType >
void QuickSort ( ItemType values[ ] , int first , int last )
// Pre: first <= last
// Post: Sorts array values[ first . . last ] into ascending order
{
if ( first < last ) // general case
{ int splitPoint ;
Split ( values, first, last, splitPoint ) ;
// values [ first ] . . values[splitPoint - 1 ] <= splitVal
// values [ splitPoint ] = splitVal
// values [ splitPoint + 1 ] . . values[ last ] > splitVal
QuickSort( values, first, splitPoint - 1 ) ;
QuickSort( values, splitPoint + 1, last );
}
}
57
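QuickSort relies on a function Split that is not shown in these slides. The following is one possible sketch that satisfies the three comments inside QuickSort, assuming the split value is the first element of the subarray (as in the examples that follow). The scanning strategy and the name QuickSortSketch are assumptions, and std::swap stands in for Swap.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Sketch of Split (assumed behavior): uses values[first] as splitVal and
// rearranges values[first..last] so that
//   values[first..splitPoint-1] <= splitVal
//   values[splitPoint]          == splitVal
//   values[splitPoint+1..last]  >  splitVal
template <class ItemType>
void Split(ItemType values[], int first, int last, int& splitPoint)
{
    assert(first <= last);
    ItemType splitVal = values[first];
    int lastSmall = first;  // values[first+1..lastSmall] are <= splitVal
    for (int index = first + 1; index <= last; index++)
    {
        if (!(splitVal < values[index]))  // values[index] <= splitVal
        {
            lastSmall++;
            std::swap(values[lastSmall], values[index]);
        }
    }
    std::swap(values[first], values[lastSmall]);  // drop splitVal into place
    splitPoint = lastSmall;
}

// The recursive sort from the slide, repeated so the sketch compiles alone.
template <class ItemType>
void QuickSortSketch(ItemType values[], int first, int last)
{
    if (first < last)
    {
        int splitPoint;
        Split(values, first, last, splitPoint);
        QuickSortSketch(values, first, splitPoint - 1);
        QuickSortSketch(values, splitPoint + 1, last);
    }
}
```

Different Split strategies can leave the elements on each side of the split value in a different order; only the three conditions in the comment matter to QuickSort.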
Before call to function Split
splitVal = 9
GOAL: place splitVal in its proper position with
all values less than or equal to splitVal on its left
and all larger values on its right
9 20 6 18 14 3 60 11
values[first] [last]
58
After call to function Split
splitVal = 9
smaller values in left part | larger values in right part
6 3 9 18 14 20 60 11
values[first] [last]
splitVal in correct position
59
Quick Sort of N elements:
How many comparisons?
N For first call, when each of N elements
is compared to the split value
2 * N/2 For the next pair of calls, when N/2
elements in each “half” of the original
array are compared to their own split values.
4 * N/4 For the four calls when N/4 elements in each
“quarter” of original array are compared to
their own split values.
.
.
. HOW MANY SPLITS CAN OCCUR?
60
Quick Sort of N elements:
How many splits can occur?
It depends on the order of the original array elements!
If each split divides the subarray approximately in half,
there will be only log2 N splits, and QuickSort is
O(N*log2 N).
But, if the original array was sorted to begin with, the
recursive calls will split up the array into parts of
unequal length, with one part empty, and the
other part containing all the rest of the array except for
the split value itself. In this case, there can be as many as
N-1 splits, and QuickSort is O(N^2).
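The worst case can be observed by counting comparisons. This is a minimal sketch under assumptions: the function name QuickSortCounting is hypothetical, it sorts int arrays only, and it takes the first element of each subarray as the split value, comparing every other element of the subarray against it once.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Hypothetical instrumented quick sort: sorts values[first..last] and
// returns the number of comparisons against split values.
long QuickSortCounting(int values[], int first, int last)
{
    assert(first >= 0);
    if (first >= last)
        return 0;  // zero or one element: nothing to compare
    long comparisons = 0;
    int splitVal = values[first];
    int lastSmall = first;  // values[first+1..lastSmall] are <= splitVal
    for (int index = first + 1; index <= last; index++)
    {
        comparisons++;
        if (values[index] <= splitVal)
        {
            lastSmall++;
            std::swap(values[lastSmall], values[index]);
        }
    }
    std::swap(values[first], values[lastSmall]);
    comparisons += QuickSortCounting(values, first, lastSmall - 1);
    comparisons += QuickSortCounting(values, lastSmall + 1, last);
    return comparisons;
}
```

On an already sorted array of 8 elements every split leaves an empty left part, so the count is 7 + 6 + . . . + 1 = 28, which is N * (N-1) / 2. An input that splits more evenly needs fewer comparisons.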
61
Before call to function Split
splitVal = 9
GOAL: place splitVal in its proper position with
all values less than or equal to splitVal on its left
and all larger values on its right
9 20 26 18 14 53 60 11
values[first] [last]
62
After call to function Split
splitVal = 9
no smaller values: empty left part | larger values: in right part with N-1 elements
9 20 26 18 14 53 60 11
values[first] [last]
splitVal in correct position
63
Radix Sort (1/5)
The radix sorting algorithm is quite different
from the others.
It treats each data element as a character
string.
– 327 as ’327’.
Strategy:
Repeatedly (right-to-left) organizes the data into
groups according to the ith character in each
element.
64
Radix Sort (2/5)
Begin by organizing the data into groups according to their
rightmost letters.
– The strings in each group end with the same letter.
– The groups are ordered by that letter.
– The strings within each group retain their relative order
from the original list of strings.
Example:
You can pad numbers on the left with zeros,
making them all appear to be the same length.
65
Radix Sort (3/5)
Combine the groups into one.
Next, form new groups as before, but this
time use the next-to-last digits.
66
Radix Sort (4/5)
To sort n d-digit numbers, Radix sort requires:
n moves each time it forms groups.
n moves to combine them into one group.
In total, 2 * n * d moves.
O(n), treating the number of digits d as a constant.
67
Radix Sort (5/5)
Despite its efficiency, radix sort has some
difficulties that make it inappropriate as a
general-purpose sorting algorithm.
To sort integers, you need to accommodate 10
groups (0, 1, …, 9).
Each group must be able to hold n strings.
For large n, this requirement demands
substantial memory!!
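The grouping and combining steps described on the radix sort slides can be sketched for non-negative integers as follows. This is an illustration under assumptions, not the slides' implementation: the function name RadixSort, the use of std::vector for the 10 groups, and the limit of 9 decimal digits are all choices made here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: sorts non-negative integers that have at most numDigits decimal
// digits, working from the rightmost digit to the leftmost.
void RadixSort(std::vector<int>& values, int numDigits)
{
    assert(numDigits >= 0 && numDigits <= 9);  // keeps divisor within int
    int divisor = 1;  // selects the current digit: (value / divisor) % 10
    for (int pass = 0; pass < numDigits; pass++)
    {
        // Form 10 groups; appending preserves the relative order of
        // values that share the same digit.
        std::vector<std::vector<int> > groups(10);
        for (std::size_t i = 0; i < values.size(); i++)
            groups[(values[i] / divisor) % 10].push_back(values[i]);

        // Combine the groups into one list, group 0 first.
        std::size_t next = 0;
        for (int digit = 0; digit < 10; digit++)
            for (std::size_t j = 0; j < groups[digit].size(); j++)
                values[next++] = groups[digit][j];

        divisor *= 10;
    }
}
```

Each pass makes n moves into the groups and n moves back, matching the 2 * n * d total. Groups that grow on demand soften the memory concern somewhat, but the sketch still needs extra space proportional to n on every pass.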
68
Mergesort
A recursive sorting algorithm.
Strategy:
Divide an array into halves.
Sort each half (by calling itself recursively).
Merge the sorted halves into one sorted
array.
Base case: an array of one item IS SORTED.
69
Mergesort
theArray: 8 1 4 3 2
Divide the array in half and conquer
...
sorted array: 1 4 8 2 3
Merge the halves
tempArray: 1 2 3 4 8
copy
theArray: 1 2 3 4 8
70
Mergesort
void mergesort(DataType theArray[], int first, int last)
{
if (first < last)
{ int mid = (first + last)/2; // index of midpoint
mergesort(theArray, first, mid);
mergesort(theArray, mid+1, last);
// merge the two halves
merge(theArray, first, mid, last);
} // end if
} // end mergesort
71
const int MAX_SIZE = 10000;
void merge(DataType theArray[], int first, int mid, int last)
{
    // Illustration: merging segments (... 5 8 ...) and (... 3 9 ...) of
    // anArray fills tempArray with ... 3 5 ...
    DataType tempArray[MAX_SIZE]; // temporary array
    // initialize the local indexes to indicate the subarrays
    int first1 = first;    // beginning of first subarray
    int last1 = mid;       // end of first subarray
    int first2 = mid + 1;  // beginning of second subarray
    int last2 = last;      // end of second subarray
    // while both subarrays are not empty, copy the
    // smaller item into the temporary array
    int index = first1;    // next available location in tempArray
    for (; (first1 <= last1) && (first2 <= last2); ++index)
    {
        if (theArray[first1] < theArray[first2])
        {   tempArray[index] = theArray[first1];
            ++first1;
        }
        else
        {   tempArray[index] = theArray[first2];
            ++first2;
        } // end if
    } // end for
    // finish off the nonempty subarray
    // finish off the first subarray, if necessary
    for (; first1 <= last1; ++first1, ++index)
        tempArray[index] = theArray[first1];
    // finish off the second subarray, if necessary
    for (; first2 <= last2; ++first2, ++index)
        tempArray[index] = theArray[first2];
    // copy the result back into the original array
    for (index = first; index <= last; ++index)
        theArray[index] = tempArray[index];
} // end merge
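The merge function above depends on a fixed MAX_SIZE and on a DataType that the slides leave undefined. The sketch below is one way to make the pair self-contained, assuming DataType is int and sizing the temporary buffer to the segment being merged; the names MergeSketch and MergesortSketch are hypothetical.

```cpp
#include <cassert>
#include <vector>

typedef int DataType;  // assumption: the slides do not define DataType

// Sketch: merges sorted theArray[first..mid] and theArray[mid+1..last].
void MergeSketch(DataType theArray[], int first, int mid, int last)
{
    assert(first <= mid && mid <= last);
    std::vector<DataType> temp;  // holds only the items being merged
    temp.reserve(last - first + 1);
    int first1 = first;    // next unread item in the first segment
    int first2 = mid + 1;  // next unread item in the second segment
    while (first1 <= mid && first2 <= last)
    {
        if (theArray[first2] < theArray[first1])
            temp.push_back(theArray[first2++]);
        else
            temp.push_back(theArray[first1++]);  // ties taken from the left
    }
    while (first1 <= mid)
        temp.push_back(theArray[first1++]);   // finish off first segment
    while (first2 <= last)
        temp.push_back(theArray[first2++]);   // finish off second segment
    for (int i = 0; i < static_cast<int>(temp.size()); i++)
        theArray[first + i] = temp[i];        // copy back
}

void MergesortSketch(DataType theArray[], int first, int last)
{
    if (first < last)
    {
        int mid = first + (last - first) / 2;  // index of midpoint
        MergesortSketch(theArray, first, mid);
        MergesortSketch(theArray, mid + 1, last);
        MergeSketch(theArray, first, mid, last);
    }
}
```

Taking ties from the left segment keeps equal items in their original relative order, a design choice of this sketch rather than something the slides require.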
73
Mergesort
void mergesort(DataType theArray[], int first, int last)
{
if (first < last)
{ int mid = (first + last)/2;
mergesort(theArray, first, mid);
mergesort(theArray, mid+1, last);
merge(theArray, first, mid, last);
}
}
Recursive calls (first levels):
mergesort(theArray,0,5)
mergesort(theArray,0,2) mergesort(theArray,3,5)
mergesort(theArray,0,1) mergesort(theArray,2,2)
74
Mergesort
Analysis (worst case):
The merge step of the algorithm requires
the most effort.
– Each merge step merges theArray[first…mid] and
theArray[mid+1…last].
– If the total number of items in the two array segments is m:
Merging the segments requires at most:
m-1 comparisons.
m moves from the original array to the temporary array.
m moves from the temporary array back to the original
array.
Each merge requires 3*m-1 major operations.
75
Mergesort
Level 0: n
Merge n items: 3*n-1 operations, or O(n)
Level 1: n/2  n/2
Two merges of n/2 items each: 2*(3*n/2-1) = 3n-2 operations, or O(n)
Level 2: n/4  n/4  n/4  n/4
Each level requires O(n) operations
. . .
Level log2 n (or 1 + log2 n rounded down): 1 1 1 ... 1 1 1
Each level requires O(n) operations and there are O(log2 n) levels,
so mergesort is O(n*log2 n).
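The per-level counts can be added up explicitly. The derivation below is a sketch that assumes n is a power of two, so that level k holds 2^k segments of n/2^k items each and the single-item segments at level log2 n need no merging.

```latex
\begin{align*}
\text{operations at level } k
  &= 2^k \left( 3 \cdot \frac{n}{2^k} - 1 \right) = 3n - 2^k \\
\text{total operations}
  &= \sum_{k=0}^{\log_2 n - 1} \left( 3n - 2^k \right)
   = 3n \log_2 n - (n - 1)
   = O(n \log_2 n)
\end{align*}
```

Setting k = 0 and k = 1 reproduces the 3n-1 and 3n-2 figures given for the first two levels.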
76