Shellsort is a generalization of insertion sort that improves performance by allowing elements to jump farther ahead in the array during sorting. It works by first sorting array elements spaced far apart, then reducing the spacing and sorting again, until it reaches the normal insertion-sort spacing of adjacent elements. While its exact running time is unknown, experiments show it is faster than the O(n^2) time of basic insertion sort, likely around O(n^(3/2)) or O(n^(7/6)) time.
Shellsort
Review: Insertion sort
The outer loop of insertion sort is: for (outer = 1; outer < a.length; outer++) {...}
The invariant is that all the elements to the left of outer are sorted with respect to one another: for all i < outer and j < outer, if i < j then a[i] <= a[j]
This does not mean they are all in their final correct places; the remaining array elements may still need to be inserted among them
When we increase outer, a[outer-1] joins the region to its left; we must keep the invariant true by inserting a[outer-1] into its proper place
This means:
Finding the element's proper place
Making room for the inserted element (by shifting other elements over)
Inserting the element

One step of insertion sort
[Diagram: the sorted prefix is 3 4 7 12 14 14 20 21 33 38, followed by the unsorted elements 10 55 9 23 28 16. The next element to be inserted, 10, is copied to temp; every sorted element greater than 10 (12 through 38) shifts one place right, and 10 is inserted after the elements less than it (3 4 7).]
This one step takes O(n) time
We must do it n times; hence, insertion sort is O(n^2)

The idea of shellsort
With insertion sort, each time we insert an element, the other elements get nudged only one step closer to where they ought to be
What if we could move elements a much longer distance each time? We could move each element:
A long distance
A somewhat shorter distance
A shorter distance still
This approach is what makes shellsort so much faster than insertion sort

Sorting nonconsecutive subarrays
Here is an array to be sorted (the actual numbers aren't important)
Consider just the red locations; suppose we do an insertion sort on just these numbers, as if they were the only ones in the array
Now consider just the yellow locations; we do an insertion sort on just these numbers
Now do the same for each additional group of numbers
The resulting array is sorted within groups, but not overall

Doing the 1-sort
In the previous slide, we compared numbers that were spaced every 5 locations; this is a 5-sort
Ordinary insertion sort is just like this, only the numbers are spaced 1 apart; we can think of it as a 1-sort
Suppose, after doing the 5-sort, we do a 1-sort?
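The h-sort described above can be sketched as an insertion sort whose shifts move by h positions instead of 1. This is an illustrative sketch, not code from the slides; the class name HSort and method name hSort are my own.

```java
// Sketch of one "h-sort" pass: insertion sort on elements spaced h apart.
public class HSort {
    // Insertion-sorts every subsequence a[i], a[i+h], a[i+2h], ... in place.
    static void hSort(int[] a, int h) {
        for (int outer = h; outer < a.length; outer++) {
            int temp = a[outer];          // next element to be inserted
            int inner = outer;
            // Shift larger elements h positions to the right to make room
            while (inner >= h && a[inner - h] > temp) {
                a[inner] = a[inner - h];
                inner -= h;
            }
            a[inner] = temp;              // insert the element into its place
        }
    }

    public static void main(String[] args) {
        int[] a = {38, 33, 21, 20, 14, 14, 12, 10, 55, 9, 23, 28, 16, 3, 4, 7};
        hSort(a, 5);   // a 5-sort: each group of every-5th element is now sorted
        System.out.println(java.util.Arrays.toString(a));
        hSort(a, 1);   // a 1-sort is ordinary insertion sort; now fully sorted
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Note that with h = 1 this is exactly the insertion sort reviewed above, so the final 1-sort always leaves the array completely sorted.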
In general, we would expect each insertion to involve moving fewer numbers out of the way
The array would end up completely sorted

Diminishing gaps
For a large array, we don't want to do just a 5-sort; we want to do an h-sort, where h depends on the size of the array
h is called the gap size, or interval size
We may want to do several stages, reducing the gap size each time
For example, on a 1000-element array, we may want to do a 364-sort, then a 121-sort, then a 40-sort, then a 13-sort, then a 4-sort, then a 1-sort
Why these numbers?

The Knuth gap sequence
No one knows the optimal sequence of diminishing gaps
This sequence is attributed to Donald E. Knuth:
Start with h = 1
Repeatedly compute h = 3*h + 1, giving 1, 4, 13, 40, 121, 364, 1093, ...
Stop when h is larger than the size of the array, and use the previous number as the first gap (for 1000 elements, 364)
To get successive smaller gaps, apply the inverse formula: h = (h - 1) / 3
This sequence seems to work very well
It turns out that just cutting the gap size in half each time does not work as well

Analysis I
Suppose you cut the gap size by some fixed factor each time
Consequently, you have about log n stages
Each stage takes O(n) time
Hence, the algorithm takes O(n log n) time
Right? Wrong!
This analysis assumes that each stage actually moves elements closer to where they ought to be, by a fairly large amount
What if all the red cells, for instance, contain the largest numbers in the array? None of them gets much closer to where it should be
In fact, if we just cut the gap size in half each time, we sometimes get O(n^2) behavior!

Analysis II
So what is the real running time of shellsort? Nobody knows!
Experiments suggest something like O(n^(3/2)) or O(n^(7/6))
Analysis isn't always easy!

The End
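Putting the pieces together, a complete shellsort using the Knuth gap sequence might look like the sketch below. This is illustrative code in the spirit of the slides, not the slides' own listing; the class and method names are my own.

```java
import java.util.Arrays;

// Sketch of shellsort with Knuth's diminishing gap sequence.
public class Shellsort {
    public static void sort(int[] a) {
        // Grow h by h = 3*h + 1 (1, 4, 13, 40, 121, 364, ...) and stop
        // before it reaches the array size; that gives the first gap.
        int h = 1;
        while (3 * h + 1 < a.length) {
            h = 3 * h + 1;
        }
        // Do an h-sort for each diminishing gap, ending with a 1-sort
        while (h >= 1) {
            for (int outer = h; outer < a.length; outer++) {
                int temp = a[outer];
                int inner = outer;
                while (inner >= h && a[inner - h] > temp) {
                    a[inner] = a[inner - h];
                    inner -= h;
                }
                a[inner] = temp;
            }
            h = (h - 1) / 3;   // inverse of h = 3*h + 1
        }
    }

    public static void main(String[] args) {
        int[] a = {38, 33, 21, 20, 14, 14, 12, 10, 55, 9, 23, 28, 16};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

For a 1000-element array, the growth loop stops at h = 364, so the gaps used are exactly the 364, 121, 40, 13, 4, 1 sequence given in the Diminishing gaps slide.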