Sorting
m. alishahid
Objectives
Insertion Sort
Shell Sort
Selection Sort
Bubble Sort
Quick Sort
Heap Sort
Merge Sort
Introduction
One of the most common applications in computer science is sorting, the process through which data are arranged according to their values. If data were not ordered in some way, we would spend an incredible amount of time trying to find the correct information.
Introduction
To appreciate this, imagine trying to find someone's number in the telephone book if the names were not sorted in some way!
Terminology - Passes
During the sorting process, the data is traversed many times. Each traversal of the data is referred to as a sort pass. Depending on the algorithm, the sort pass may traverse the whole list or just a section of the list. The sort pass may also include the placement of one or more elements into the sorted list.
Types of Sorts
Insertion Sort
Shell Sort
Selection Sort
Bubble Sort
Quick Sort
Heap Sort
Merge Sort
Insertion Sort
In each pass of an insertion sort, one or more pieces of data are inserted into their correct location in an ordered list (just as a card player picks up cards and places them in his hand in order).
Insertion Sort
[Figure: the list divided by a conceptual wall into a Sorted sublist and an Unsorted sublist]
In each pass, the first element of the unsorted sublist is transferred to the sorted sublist by inserting it at the appropriate place. If we have a list of n elements, it will take, at most, n-1 passes to sort the data.
Insertion Sort
We can visualize this type of sort with the above figure. The first part of the list is the sorted portion which is separated by a conceptual wall from the unsorted portion of the list.
Insertion Sort
On our first pass, we look at the 2nd element of the list (the first element of the unsorted list) and place it in our sorted list in order. On our next pass, we look at the 3rd element of the list (the first element of the unsorted list) and place it in the sorted list in order.
In our first pass, we compare the first 2 values. Since they are out of order, we swap them. We then look at the next 2 values; because they too are out of order, we swap them. Whenever a swap is made, we also need to look back at the previous 2 values to make sure they are still in order. We continue comparing and swapping in this way with the next 2 values, and so on, until the pass is complete.
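To make the passes concrete, below is a minimal C sketch of a straight insertion sort; the function name, array, and size parameter are illustrative rather than taken from the slides.

    /* Straight insertion sort: a[0..i-1] is the sorted sublist on each pass. */
    void insertionSort(int a[], int n) {
        for (int i = 1; i < n; i++) {
            int value = a[i];                 /* first element of the unsorted sublist */
            int j = i - 1;
            while (j >= 0 && a[j] > value) {  /* shift larger sorted elements right */
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = value;                 /* insert it at the appropriate place */
        }
    }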
Shell Sort
Named after its creator, Donald Shell, the shell sort is an improved version of the insertion sort. In the shell sort, a list of N elements is divided into K segments where K is known as the increment. What this means is that instead of comparing adjacent values, we will compare values that are a distance K apart. We will shrink K as we run through our algorithm.
Shell Sort
Pass 1 (K = 5)

List                            Notes
77 62 14  9 30 21 80 25 70 55  Compare 77 and 21: swap
21 62 14  9 30 77 80 25 70 55  Compare 62 and 80: in order
21 62 14  9 30 77 80 25 70 55  Compare 14 and 25: in order
21 62 14  9 30 77 80 25 70 55  Compare  9 and 70: in order
21 62 14  9 30 77 80 25 70 55  Compare 30 and 55: in order
Just as in the straight insertion sort, we compare 2 values and swap them if they are out of order. However, in the shell sort we compare values that are a distance K apart. Once we have completed going through the elements in our list with K=5, we decrease K and continue the process.
Shell Sort
Pass 2 (K = 2)

List                            Notes
21 62 14  9 30 77 80 25 70 55  Compare 21 and 14: swap
14 62 21  9 30 77 80 25 70 55  Compare 62 and  9: swap
14  9 21 62 30 77 80 25 70 55  Compare 21 and 30: in order
14  9 21 62 30 77 80 25 70 55  Compare 62 and 77: in order
14  9 21 62 30 77 80 25 70 55  Compare 30 and 80: in order
14  9 21 62 30 77 80 25 70 55  Compare 77 and 25: swap
14  9 21 62 30 25 80 77 70 55  Compare 62 and 25: swap (recheck after swap)
14  9 21 25 30 62 80 77 70 55  Compare  9 and 25: in order (recheck after swap)
14  9 21 25 30 62 80 77 70 55  Compare 80 and 70: swap
14  9 21 25 30 62 70 77 80 55  Compare 30 and 70: in order (recheck after swap)
14  9 21 25 30 62 70 77 80 55  Compare 77 and 55: swap
14  9 21 25 30 62 70 55 80 77  Compare 62 and 55: swap (recheck after swap)
14  9 21 25 30 55 70 62 80 77  Compare 25 and 55: in order (recheck after swap)
Here we have reduced K to 2. Just as in the insertion sort, if we swap 2 values, we have to go back and compare the previous 2 values to make sure they are still in order.
Shell Sort
All shell sorts will terminate by running an insertion sort (i.e., K=1). However, using the larger values of K first has helped to sort our list so that the straight insertion sort will run faster.
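As a sketch only, here is one way the shell sort can be written in C; halving the increment K each time is a common but by no means the only choice, and the names are illustrative.

    /* Shell sort: insertion sort applied to elements a distance K apart,
       with K shrinking until the final pass is a straight insertion sort (K = 1). */
    void shellSort(int a[], int n) {
        for (int k = n / 2; k > 0; k /= 2) {
            for (int i = k; i < n; i++) {
                int value = a[i];
                int j = i;
                while (j >= k && a[j - k] > value) {  /* compare values K apart */
                    a[j] = a[j - k];
                    j -= k;
                }
                a[j] = value;
            }
        }
    }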
Shell Sort
There are many schools of thought on what the increment should be in the shell sort. Also note that just because an increment is optimal on one list, it might not be optimal for another list.
Comparing the Big-O notation (for the average case) we find that the straight insertion sort is O(n^2) while the Shell sort is O(n^1.25). Although this doesn't seem like much of a gain, it makes a big difference as n gets large. Note that in the worst case, the Shell sort has an efficiency of O(n^2). However, using a special incrementing technique, this worst case can be reduced to O(n^1.5).
Selection Sort
Imagine some data that you can examine all at once. To sort it, you could select the smallest element and put it in its place, select the next smallest and put it in its place, etc. For a card player, this process is analogous to looking at an entire hand of cards and ordering them by selecting cards one at a time and placing them in their proper order.
Selection Sort
The selection sort follows this idea. Given a list of data to be sorted, we simply select the smallest item and place it in a sorted list. We then repeat these steps until the list is sorted.
Selection Sort
In the selection sort, the list at any moment is divided into 2 sublists, sorted and unsorted, separated by a conceptual wall. We select the smallest element from the unsorted sublist and exchange it with the element at the beginning of the unsorted data. After each selection and exchange, the wall between the 2 sublists moves, increasing the number of sorted elements and decreasing the number of unsorted elements.
Selection Sort
We start with an unsorted list. We search the list for the smallest element (8). We then exchange the smallest element (8) with the first element in the unsorted list (23) and move the conceptual wall. Again, we search the unsorted list for the smallest element (23). We then exchange the smallest element (23) with the first element in the unsorted list (78) and move the conceptual wall. This process continues until the unsorted list is fully sorted.
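A minimal C sketch of the selection sort described above; the variable names (wall, smallest) are illustrative.

    /* Selection sort: the wall separates the sorted and unsorted sublists. */
    void selectionSort(int a[], int n) {
        for (int wall = 0; wall < n - 1; wall++) {
            int smallest = wall;
            for (int j = wall + 1; j < n; j++)     /* find the smallest unsorted element */
                if (a[j] < a[smallest])
                    smallest = j;
            int temp = a[wall];                    /* exchange it with the first */
            a[wall] = a[smallest];                 /* element of the unsorted sublist */
            a[smallest] = temp;
            /* the wall then advances by one position */
        }
    }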
Bubble Sort
In the bubble sort, the list at any moment is divided into 2 sublists, sorted and unsorted. The smallest element is bubbled from the unsorted sublist to the sorted sublist.
Bubble Sort
Consider the list 23 78 45 8 56 32. We start with 32 and compare it with 56. Because 32 is less than 56, we swap the two and step down one element. We then compare 32 and 8. Because 32 is not less than 8, we do not swap these elements. We step down one element and compare 45 and 8. They are out of sequence, so we swap them and step down again. We then compare 8 with 78; these two elements are swapped. Finally, 8 is compared with 23 and swapped, leaving 8 at the front of the list: 8 23 78 45 32 56. We then continue this process, starting again at the back of the list with 56.
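A minimal C sketch of this bubble sort, which bubbles the smallest remaining element toward the front of the list; the names are illustrative.

    /* Bubble sort: on each pass the smallest unsorted element bubbles up to the wall. */
    void bubbleSort(int a[], int n) {
        for (int wall = 0; wall < n - 1; wall++) {
            for (int j = n - 1; j > wall; j--) {   /* walk from the rear toward the wall */
                if (a[j] < a[j - 1]) {             /* out of sequence: swap the pair */
                    int temp = a[j];
                    a[j] = a[j - 1];
                    a[j - 1] = temp;
                }
            }
        }
    }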
Quick Sort
In the bubble sort, consecutive items are compared and possibly exchanged on each pass through the list. This means that many exchanges may be needed to move an element to its correct position. Quick sort is more efficient than bubble sort because a typical exchange involves elements that are far apart, so fewer exchanges are required to correctly position an element.
Quick Sort
Each iteration of the quick sort selects an element, known as the pivot, and divides the list into 3 groups:
Elements whose keys are less than (or equal to) the pivot's key.
The pivot element.
Elements whose keys are greater than (or equal to) the pivot's key.
Quick Sort
The sorting then continues by quick sorting the left partition followed by quick sorting the right partition. The basic algorithm is as follows:
Quick Sort
1) Partitioning Step: Take an element in the unsorted array and determine its final location in the sorted array. This occurs when all values to the left of the element in the array are less than (or equal to) the element, and all values to the right of the element are greater than (or equal to) the element. We now have 1 element in its proper location and two unsorted subarrays.
2) Recursive Step: Perform step 1 on each unsorted subarray.
Quick Sort
Each time step 1 is performed on a subarray, another element is placed in its final location of the sorted array, and two unsorted subarrays are created. When a subarray consists of one element, that subarray is sorted. Therefore that element is in its final location.
Quick Sort
There are several partitioning strategies used in practice (i.e., several versions of quick sort), but the one we are about to describe is known to work well. For simplicity we will choose the last element to be the pivot element. We could also choose a different pivot element and swap it with the last element in the array.
Quick Sort
The index left starts at the first element and right starts at the next-to-last element; the last element, 6, is the pivot.
4 8 9 0 11 5 10 7 6   (left on 4, right on 7)
We want to move all the elements smaller than the pivot to the left part of the array and all the elements larger than the pivot to the right part.
Quick Sort
We move left to the right, skipping over elements that are smaller than the pivot.
4 8 9 0 11 5 10 7 6   (left stops on 8, right on 7)
Quick Sort
We then move right to the left, skipping over elements that are greater than the pivot.
4 8 9 0 11 5 10 7 6   (left on 8, right stops on 5)
When left and right have stopped, left is on an element greater than (or equal to) the pivot and right is on an element smaller than (or equal to) the pivot.
Quick Sort
If left is to the left of right (or if left = right), those elements are swapped.
4 5 9 0 11 8 10 7 6   (8 and 5 have been swapped; left is now on 5, right on 8)
Quick Sort
The effect is to push a large element to the right and a small element to the left. We then repeat the process until left and right cross.
Quick Sort
4 5 9 0 11 8 10 7 6   (left advances and stops on 9, right stops on 0)
4 5 0 9 11 8 10 7 6   (9 and 0 are swapped)
Quick Sort
4 5 0 9 11 8 10 7 6   (left advances and stops on 9, right moves past it and stops on 0)
Quick Sort
4 5 0 9 11 8 10 7 6   (right is on 0, left is on 9: they have crossed)
At this point, left and right have crossed so no swap is performed. The final part of the partitioning is to swap the pivot element with left.
4 5 0 6 11 8 10 7 9   (the pivot, 6, has been swapped with the element at left)
Quick Sort
Note that all elements to the left of the pivot are less than (or equal to) the pivot and all elements to the right of the pivot are greater than (or equal to) the pivot. Hence, the pivot element has been placed in its final sorted position.
4 5 0 6 11 8 10 7 9
Quick Sort
We now repeat the process using the sub-arrays to the left and right of the pivot.
4 5 0 6 11 8 10 7 9   (left subarray: 4 5 0; right subarray: 11 8 10 7 9)
Quick Sort
There are better ways to choose the pivot value (such as the median-of-three method). Also, when the subarrays get small, it becomes more efficient to use the insertion sort as opposed to continued use of quick sort.
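A minimal C sketch of the quick sort described above, using the last element as the pivot; the swap helper is illustrative, and the median-of-three and small-subarray refinements just mentioned are omitted.

    static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

    /* Quick sort on a[lo..hi]: partition around the last element, then recurse. */
    void quickSort(int a[], int lo, int hi) {
        if (lo >= hi) return;                      /* 0 or 1 element: already sorted */
        int pivot = a[hi];
        int left = lo, right = hi - 1;
        while (left <= right) {
            while (left <= right && a[left] < pivot)  left++;   /* skip smaller elements */
            while (left <= right && a[right] > pivot) right--;  /* skip larger elements */
            if (left <= right) {                   /* out-of-place pair: swap it */
                swap(&a[left], &a[right]);
                left++;
                right--;
            }
        }
        swap(&a[left], &a[hi]);                    /* place the pivot in its final spot */
        quickSort(a, lo, left - 1);                /* quick sort the left partition */
        quickSort(a, left + 1, hi);                /* quick sort the right partition */
    }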
If we calculate the Big-O notation we find that, in the average case, the quick sort is O(n log2 n).
Heap Sort
Idea: take the items that need to be sorted and insert them into a heap. By calling deleteHeap, we remove the smallest or largest element, depending on whether we are working with a min-heap or a max-heap, respectively. Hence, the elements are removed in ascending or descending order. Efficiency: O(n log2 n)
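A minimal in-place C sketch of heap sort using a max-heap; the reheapDown (sift-down) helper stands in for the insert/deleteHeap routines the slide refers to, so the names here are illustrative.

    /* Restore the max-heap property below the given root, within the first n elements. */
    static void reheapDown(int a[], int root, int n) {
        while (2 * root + 1 < n) {
            int child = 2 * root + 1;
            if (child + 1 < n && a[child + 1] > a[child])
                child++;                            /* pick the larger child */
            if (a[root] >= a[child])
                break;                              /* heap property holds */
            int t = a[root]; a[root] = a[child]; a[child] = t;
            root = child;
        }
    }

    void heapSort(int a[], int n) {
        for (int i = n / 2 - 1; i >= 0; i--)        /* build the max-heap */
            reheapDown(a, i, n);
        for (int last = n - 1; last > 0; last--) {  /* repeatedly remove the largest */
            int t = a[0]; a[0] = a[last]; a[last] = t;
            reheapDown(a, 0, last);                 /* re-heap the remaining elements */
        }
    }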
Merge Sort
Idea:
Take the array you would like to sort and divide it in half to create 2 unsorted subarrays. Next, sort each of the 2 subarrays. Finally, merge the 2 sorted subarrays into 1 sorted array.
Efficiency: O(n log2 n)
Merge Sort
Although the merge step produces a sorted array, we have overlooked a very important step. How did we sort the 2 halves before performing the merge step? We used merge sort!
Merge Sort
By continually calling the merge sort algorithm, we eventually get a subarray of size 1. Since an array with only 1 element is clearly sorted, we can back out and merge 2 arrays of size 1.
Merge Sort
The merge step requires 2 input arrays (arrayA and arrayB), an output array (arrayC), and 3 position holders (indexA, indexB, indexC), which are initially set to the beginning of their respective arrays.
Merge Sort
The smaller of arrayA[indexA] and arrayB[indexB] is copied into arrayC[indexC] and the appropriate position holders are advanced. When either input list is exhausted, the remainder of the other list is copied into arrayC.
Merge Sort
arrayA: 1 13 24 26   (indexA on 1)
arrayB: 2 15 27 38   (indexB on 2)
arrayC: (empty; indexC at the first position)
We compare arrayA[indexA] with arrayB[indexB]. Whichever value is smaller is placed into arrayC[indexC]. 1 < 2, so we insert arrayA[indexA] into arrayC[indexC].
Merge Sort
arrayA
13 indexA 2 15
24 26
2 < 13 so we insert arrayB[indexB] into arrayC[indexC]
arrayB
27 38
indexB 1 indexC
arrayC
Merge Sort
arrayA: 1 13 24 26   (indexA on 13)
arrayB: 2 15 27 38   (indexB on 15)
arrayC: 1 2
13 < 15, so we insert arrayA[indexA] into arrayC[indexC].
Merge Sort
arrayA: 1 13 24 26   (indexA on 24)
arrayB: 2 15 27 38   (indexB on 15)
arrayC: 1 2 13
15 < 24, so we insert arrayB[indexB] into arrayC[indexC].
Merge Sort
arrayA: 1 13 24 26   (indexA on 24)
arrayB: 2 15 27 38   (indexB on 27)
arrayC: 1 2 13 15
24 < 27, so we insert arrayA[indexA] into arrayC[indexC].
Merge Sort
arrayA: 1 13 24 26   (indexA on 26)
arrayB: 2 15 27 38   (indexB on 27)
arrayC: 1 2 13 15 24
26 < 27, so we insert arrayA[indexA] into arrayC[indexC].
Merge Sort
arrayA: 1 13 24 26   (indexA past the end; arrayA is exhausted)
arrayB: 2 15 27 38   (indexB on 27)
arrayC: 1 2 13 15 24 26
Since we have exhausted one of the arrays, arrayA, we simply copy the remaining items from the other array, arrayB, into arrayC.
Merge Sort
arrayA: 1 13 24 26
arrayB: 2 15 27 38
arrayC: 1 2 13 15 24 26 27 38
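Putting the recursive splitting and the merge step together, here is a minimal C sketch of merge sort; the function names and the temporary array tmp are illustrative.

    #include <stdlib.h>

    /* Merge the sorted ranges a[lo..mid] and a[mid+1..hi] using tmp as scratch space. */
    static void merge(int a[], int lo, int mid, int hi, int tmp[]) {
        int indexA = lo, indexB = mid + 1, indexC = lo;
        while (indexA <= mid && indexB <= hi)                /* copy the smaller value */
            tmp[indexC++] = (a[indexA] <= a[indexB]) ? a[indexA++] : a[indexB++];
        while (indexA <= mid) tmp[indexC++] = a[indexA++];   /* copy any remainder */
        while (indexB <= hi)  tmp[indexC++] = a[indexB++];
        for (int i = lo; i <= hi; i++) a[i] = tmp[i];        /* copy back into a */
    }

    static void mergeSortRange(int a[], int lo, int hi, int tmp[]) {
        if (lo >= hi) return;                     /* a 1-element subarray is sorted */
        int mid = (lo + hi) / 2;
        mergeSortRange(a, lo, mid, tmp);          /* sort the left half */
        mergeSortRange(a, mid + 1, hi, tmp);      /* sort the right half */
        merge(a, lo, mid, hi, tmp);               /* merge the two sorted halves */
    }

    void mergeSort(int a[], int n) {
        int *tmp = malloc(n * sizeof(int));
        if (tmp == NULL) return;                  /* give up if allocation fails */
        mergeSortRange(a, 0, n - 1, tmp);
        free(tmp);
    }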
Efficiency Summary
Sort        Worst Case      Average Case
Insertion   O(n^2)          O(n^2)
Shell       O(n^1.5)        O(n^1.25)
Selection   O(n^2)          O(n^2)
Bubble      O(n^2)          O(n^2)
Quick       O(n^2)          O(n log2 n)
Heap        O(n log2 n)     O(n log2 n)
Merge       O(n log2 n)     O(n log2 n)