CSCI2100 07 Sorting
Sorting
Irwin King
[email protected]
http://www.cse.cuhk.edu.hk/~king
Department of Computer Science & Engineering
The Chinese University of Hong Kong
CSC2100 Data Structures, The Chinese University of Hong Kong, Irwin King, All rights reserved.
Introduction
Integers
Use internal memory
Bubble Sort
Insertion Sort
Selection Sort
Quick Sort
Heap Sort
Shell Sort
Merge Sort
Radix Sort
Types of Sorting
Single-pass
Multiple-pass
Operations in Sorting
Permutation
Inversion (Swap)
Comparison
Permutation
K-Permutation
Inversion
For example, the input list 34, 8, 64, 51, 32, 21 has nine
inversions, namely (34,8), (34,32), (34,21), (64,51), (64,32),
(64,21), (51,32), (51,21) and (32,21).
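The count can be checked mechanically. The following is a minimal sketch in C, assuming an inversion is any pair of positions i < j whose keys are out of order (a[i] > a[j]). The function name count_inversions is illustrative and not part of the course code.

```c
#include <stddef.h>

/* Count pairs (i, j) with i < j and a[i] > a[j] by brute force, O(n^2). */
long count_inversions(const int a[], size_t n)
{
    long count = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (a[i] > a[j])
                count++;
    return count;
}
```

Called on the list 34, 8, 64, 51, 32, 21 with n = 6, it returns 9.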
Preliminaries
Bubble Sort
After the first pass, the largest key in the list will have bubbled to
the end, but the earlier keys may still be out of order.
Example (each pass scans from the right, so the smallest remaining key bubbles to the front)
                List                       Number of exchanges
Original        34   8  64  51  32  21
After p = 1      8  34  21  64  51  32     4
After p = 2      8  21  34  32  64  51     3
After p = 3      8  21  32  34  51  64     2
After p = 4      8  21  32  34  51  64     0
After p = 5      8  21  32  34  51  64     0
After p = 6      8  21  32  34  51  64     0
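An Improved Version in C
#include <stdbool.h>

/* Bubble sort that stops as soon as a pass makes no exchange. */
void bubble(int x[], int n)
{
    int hold, j, pass;
    bool switched = true;   /* did the previous pass exchange anything? */

    for (pass = 0; pass < n - 1 && switched; pass++) {
        switched = false;
        /* x[n-pass..n-1] already hold the largest keys in order */
        for (j = 0; j < n - pass - 1; j++)
            if (x[j] > x[j+1]) {
                /* adjacent keys out of order: interchange them */
                switched = true;
                hold = x[j];
                x[j] = x[j+1];
                x[j+1] = hold;
            }
    }
}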
Bubble Sort
Advantage
Insertion Sort
Insertion Example
                List                       Positions moved
Original        34   8  64  51  32  21
After p = 1     34   8  64  51  32  21     0
After p = 2      8  34  64  51  32  21     1
After p = 3      8  34  64  51  32  21     0
After p = 4      8  34  51  64  32  21     1
After p = 5      8  32  34  51  64  21     3
After p = 6      8  21  32  34  51  64     4
Insertion Sort
/* Sort x[0..n-1] by inserting each x[k] into the sorted prefix x[0..k-1]. */
void insertsort(int x[], int n)
{
    int i, k, y;

    for (k = 1; k < n; k++) {
        y = x[k];
        /* move down 1 position all elements greater than y */
        for (i = k - 1; i >= 0 && y < x[i]; i--)
            x[i+1] = x[i];
        /* insert y at proper position */
        x[i+1] = y;
    }
}
The closer the file is to sorted order, the more efficient the
simple insertion sort becomes.
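One way to see this is to count how many element moves the inner loop performs. The sketch below is an instrumented copy of insertion sort. The name insertsort_count and the returned counter are additions for illustration only.

```c
/* Insertion sort that also returns the number of element shifts it made. */
long insertsort_count(int x[], int n)
{
    long shifts = 0;
    for (int k = 1; k < n; k++) {
        int y = x[k];
        int i;
        /* each iteration moves one larger element down by one position */
        for (i = k - 1; i >= 0 && y < x[i]; i--) {
            x[i + 1] = x[i];
            shifts++;
        }
        x[i + 1] = y;
    }
    return shifts;
}
```

On an already sorted list of five keys it reports 0 shifts, on the same keys in reverse order it reports 10, and on 34, 8, 64, 51, 32, 21 it reports 9, one shift per inversion.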
Consider any pair of numbers (x, y) in the list, with y > x.
Selection Sort
                List                       Positions Moved
Original        34   8  64  51  32  21
After p = 1     34   8  64  51  32  21
After p = 2      8  34  64  51  32  21
After p = 3      8  21  64  51  32  34
After p = 4      8  21  32  51  64  34
After p = 5      8  21  32  34  64  51
After p = 6      8  21  32  34  51  64
As the first stage, one scans the list to find the entry that
comes last in the order.
Comparison
                          Selection          Insertion (average)
Assignments of entries    3.0n + O(1)        0.25n^2 + O(n)
Comparisons of keys       0.5n^2 + O(n)      0.25n^2 + O(n)
/* Selection sort: repeatedly move the largest remaining key to the end. */
void selectsort(int x[], int n)
{
    int i, indx, j, large;

    for (i = n - 1; i > 0; i--) {
        /* place the largest number of x[0] through */
        /* x[i] into large and its index into indx  */
        large = x[0];
        indx = 0;
        for (j = 1; j <= i; j++)
            if (x[j] > large) {
                large = x[j];
                indx = j;
            }
        /* exchange the largest key with x[i] */
        x[indx] = x[i];
        x[i] = large;
    }
}
Shell Sort
Example
Original        81  94  11  96  12  35  17  95  28  58  41  75  15
After 5-sort    35  17  11  28  12  41  75  15  96  58  81  94  95
After 3-sort    28  12  11  35  15  41  58  17  94  75  81  96  95
After 1-sort    11  12  15  17  28  35  41  58  75  81  94  95  96
/* Shell sort of x[0..n-1] using the numinc increments in incrmnts[],
   which should be decreasing and end with 1. */
void shellsort(int x[], int n, int incrmnts[], int numinc)
{
    int incr, j, k, span, y;

    for (incr = 0; incr < numinc; incr++) {
        /* span is the size of the increment */
        span = incrmnts[incr];
        for (j = span; j < n; j++) {
            /* Insert element x[j] into its proper */
            /* position within its subfile         */
            y = x[j];
            for (k = j - span; k >= 0 && y < x[k]; k -= span)
                x[k+span] = x[k];
            x[k+span] = y;
        }
    }
}
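To follow the effect of a single increment, the body of the outer loop can be pulled out into its own function. This is a sketch only. The name shell_pass is hypothetical, and the function performs exactly one span-sort.

```c
/* One pass of Shell sort: insertion-sort every subfile whose
   elements are span positions apart. */
void shell_pass(int x[], int n, int span)
{
    for (int j = span; j < n; j++) {
        int y = x[j];
        int k;
        for (k = j - span; k >= 0 && y < x[k]; k -= span)
            x[k + span] = x[k];
        x[k + span] = y;
    }
}
```

Applying shell_pass with spans 5, 3 and 1 in turn to the list 81 94 11 96 12 35 17 95 28 58 41 75 15 gives 35 17 11 28 12 41 75 15 96 58 81 94 95 after the 5-sort, 28 12 11 35 15 41 58 17 94 75 81 96 95 after the 3-sort, and the fully sorted list after the 1-sort.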
Analysis
It has been shown that the order of the Shell sort can be
approximated by O(n (log n)^2) if an appropriate
sequence of increments is used.
Heapsort
Basic Idea
Clever Modification
Thus the cell that was last in the heap can be used to store
the element that was just deleted.
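The idea of reusing the freed cell can be sketched as an in-place heapsort. This is an illustrative version, not the course's own listing. It assumes a max-heap stored in x[0..n-1] with the children of cell i at 2i+1 and 2i+2, and the names sift_down and heapsort_inplace are hypothetical.

```c
/* Restore the max-heap property for the subtree rooted at i,
   looking only at the first n cells of x. */
static void sift_down(int x[], int n, int i)
{
    int y = x[i];
    for (;;) {
        int child = 2 * i + 1;
        if (child >= n)
            break;
        if (child + 1 < n && x[child + 1] > x[child])
            child++;                 /* pick the larger child */
        if (x[child] <= y)
            break;
        x[i] = x[child];
        i = child;
    }
    x[i] = y;
}

void heapsort_inplace(int x[], int n)
{
    /* Build a max-heap bottom-up. */
    for (int i = n / 2 - 1; i >= 0; i--)
        sift_down(x, n, i);
    /* Delete the maximum repeatedly. The heap shrinks by one each time,
       so the cell x[end] that was last in the heap stores the deleted key. */
    for (int end = n - 1; end > 0; end--) {
        int max = x[0];
        x[0] = x[end];
        x[end] = max;
        sift_down(x, end, 0);
    }
}
```

No second array is needed: the sorted keys accumulate at the back of the same array while the heap occupies the front.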
Merge Sort
Analysis
It is clear from the tree that the total length of the lists
on each level is precisely n, the total number of entries.
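The running time satisfies the recurrence
\[ T(1) = 1, \qquad T(n) = 2T(n/2) + n. \]
Dividing by n and applying the recurrence to successively smaller halves (n a power of 2):
\[
\begin{aligned}
\frac{T(n)}{n}     &= \frac{T(n/2)}{n/2} + 1 \\
\frac{T(n/2)}{n/2} &= \frac{T(n/4)}{n/4} + 1 \\
\frac{T(n/4)}{n/4} &= \frac{T(n/8)}{n/8} + 1 \\
                   &\;\;\vdots \\
\frac{T(2)}{2}     &= \frac{T(1)}{1} + 1
\end{aligned}
\]
Adding these \log n equations, the intermediate terms cancel:
\[ \frac{T(n)}{n} = \frac{T(1)}{1} + \log n \]
\[ T(n) = n \log n + n = O(n \log n) \]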
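The recurrence mirrors the shape of the algorithm: two recursive calls on halves of the list, followed by a merge that touches every entry once. A minimal array-based sketch follows. The names merge_runs, msort and mergesort_int, and the use of one scratch array allocated with malloc, are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs x[lo..mid-1] and x[mid..hi-1], using tmp as scratch. */
static void merge_runs(int x[], int tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (x[j] < x[i]) ? x[j++] : x[i++];  /* ties take the left run */
    while (i < mid)
        tmp[k++] = x[i++];
    while (j < hi)
        tmp[k++] = x[j++];
    memcpy(x + lo, tmp + lo, (size_t)(hi - lo) * sizeof x[0]);
}

static void msort(int x[], int tmp[], int lo, int hi)
{
    if (hi - lo < 2)
        return;                        /* one entry: already sorted */
    int mid = lo + (hi - lo) / 2;
    msort(x, tmp, lo, mid);            /* T(n/2) */
    msort(x, tmp, mid, hi);            /* T(n/2) */
    merge_runs(x, tmp, lo, mid, hi);   /* + n    */
}

/* Returns 0 on success, -1 if the scratch array could not be allocated. */
int mergesort_int(int x[], int n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    msort(x, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Taking the left run on ties keeps equal keys in their original relative order, so this version is stable.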
QuickSort
Example
 R1   R2   R3   R4   R5   R6   R7   R8   R9   R10     m    n
[26    5   37    1   61   11   59   15   48   19]     1   10
[11    5   19    1   15]  26  [59   61   48   37]     1    5
[ 1    5]  11  [19   15]  26  [59   61   48   37]     1    2
  1    5   11  [19   15]  26  [59   61   48   37]     4    5
  1    5   11   15   19   26  [59   61   48   37]     7   10
  1    5   11   15   19   26  [48   37]  59  [61]     7    8
  1    5   11   15   19   26   37   48   59  [61]    10   10
  1    5   11   15   19   26   37   48   59   61
Quicksort Algorithm
QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q)
             QUICKSORT(A, q + 1, r)
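Partition Algorithm
/* Rearrange x[lb..ub] around the pivot a = x[lb]. On return *pj is the
   pivot's final index; keys left of it are <= a, keys right of it are > a. */
void partition(int x[], int lb, int ub, int *pj)
{
    int a, down, temp, up;

    a = x[lb];          /* pivot */
    up = ub;
    down = lb;
    while (down < up) {
        while (x[down] <= a && down < ub)
            down++;     /* skip keys that belong on the left */
        while (x[up] > a)
            up--;       /* skip keys that belong on the right */
        if (down < up) {
            /* interchange x[down] and x[up] */
            temp = x[down];
            x[down] = x[up];
            x[up] = temp;
        }
    }
    x[lb] = x[up];
    x[up] = a;
    *pj = up;
}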
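Because this partition leaves the pivot in its final position, a driver built on it recurses on the two sides of that position and skips the pivot itself. The sketch below is one way to write such a driver. It repeats the partition logic as partition_idx, a hypothetical variant that returns the index instead of storing it through pj, so that the block compiles on its own.

```c
/* Same rearrangement as partition above, but returns the pivot's final index. */
static int partition_idx(int x[], int lb, int ub)
{
    int a = x[lb], down = lb, up = ub;
    while (down < up) {
        while (x[down] <= a && down < ub)
            down++;
        while (x[up] > a)
            up--;
        if (down < up) {
            int temp = x[down];
            x[down] = x[up];
            x[up] = temp;
        }
    }
    x[lb] = x[up];
    x[up] = a;
    return up;
}

/* Sort x[lb..ub]; call as quicksort(x, 0, n - 1). */
void quicksort(int x[], int lb, int ub)
{
    if (lb >= ub)
        return;                  /* zero or one key: nothing to do */
    int j = partition_idx(x, lb, ub);
    quicksort(x, lb, j - 1);     /* keys <= pivot */
    quicksort(x, j + 1, ub);     /* keys >  pivot */
}
```

For the list 26 5 37 1 61 11 59 15 48 19, the first partition step yields 11 5 19 1 15 26 59 61 48 37, with 26 already in its final place.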
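Partition Operation
Goal (figure): the entries from low to high rearranged as [ < p | p | >= p ], with the pivot p at the pivot location.
Loop invariant (figure): [ p | < p | >= p | ? ], with p at position low, "last small" marking the last entry of the < p part, and i marking the first entry of the unexamined part (?).
Restoring the invariant (figure): if the entry at i is < p, it is swapped into the position just after "last small"; otherwise the >= p part simply grows by one entry.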
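The invariant can be turned into code fairly directly. The following is a sketch under the assumption that the pivot p is simply the entry at position low. The names partition_last_small and swap_int are illustrative.

```c
static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Partition x[low..high] around p = x[low] and return the pivot location.
   Invariant at the top of each iteration:
     x[low] == p
     x[low+1 .. last_small] <  p
     x[last_small+1 .. i-1] >= p
     x[i .. high]           not yet examined */
int partition_last_small(int x[], int low, int high)
{
    int p = x[low];
    int last_small = low;
    for (int i = low + 1; i <= high; i++)
        if (x[i] < p) {
            last_small++;
            swap_int(&x[last_small], &x[i]);  /* extend the < p part */
        }
    swap_int(&x[low], &x[last_small]);        /* pivot goes between the parts */
    return last_small;
}
```

When the loop ends, the unexamined part is empty, and the final swap produces the goal layout with everything before the returned index less than p and everything after it greater than or equal to p.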
Analysis
So for n ≥ 2, we have
C(n,p) = (n-1) + C(p-1) + C(n-p).
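To get a feel for the numbers, the recurrence can be averaged over the pivot position and evaluated directly. The sketch below assumes that every position p = 1, ..., n is equally likely and that C(0) = C(1) = 0, which gives C(n) = (n-1) + (2/n)(C(0) + ... + C(n-1)). The function name avg_comparisons is hypothetical.

```c
/* Average number of key comparisons C(n), assuming each pivot position
   p = 1..n is equally likely:
     C(n) = (n-1) + (1/n) * sum over p of [C(p-1) + C(n-p)]
          = (n-1) + (2/n) * (C(0) + ... + C(n-1)),   C(0) = C(1) = 0. */
double avg_comparisons(int n)
{
    double c = 0.0;       /* C(m) for the current m */
    double prefix = 0.0;  /* C(0) + ... + C(m-1)    */
    for (int m = 1; m <= n; m++) {
        c = (m - 1) + 2.0 * prefix / m;
        prefix += c;
    }
    return c;
}
```

Under these assumptions the values match the closed form 2(n+1)H_n - 4n, where H_n = 1 + 1/2 + ... + 1/n; for example C(2) = 1 and C(3) = 8/3.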
Harmonic Numbers
Bucket/Radix/Postman Sort
Analysis
If the keys are long but there are relatively few of them,
then k is large and n relatively small, and other methods
(such as mergesort) will outperform radix sort.
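The dependence on both n and k is visible in a least-significant-digit radix sort. The sketch below assumes non-negative integer keys of at most k decimal digits, with k small enough that 10^(k-1) fits in an unsigned int. The name radixsort_dec and the counting-sort formulation of each pass are illustrative choices, not the course's own code.

```c
#include <stdlib.h>

/* LSD radix sort of n non-negative keys with at most k decimal digits.
   Each of the k passes is a stable counting sort on one digit and does
   work proportional to n, so the total work is proportional to n * k.
   Returns 0 on success, -1 if the scratch array could not be allocated. */
int radixsort_dec(unsigned x[], int n, int k)
{
    if (n < 2)
        return 0;
    unsigned *out = malloc((size_t)n * sizeof *out);
    if (out == NULL)
        return -1;
    unsigned div = 1;
    for (int pass = 0; pass < k; pass++, div *= 10) {
        int count[10] = {0};
        for (int i = 0; i < n; i++)
            count[(x[i] / div) % 10]++;
        for (int d = 1; d < 10; d++)        /* count[d] = end of bucket d */
            count[d] += count[d - 1];
        for (int i = n - 1; i >= 0; i--)    /* backwards keeps ties in order */
            out[--count[(x[i] / div) % 10]] = x[i];
        for (int i = 0; i < n; i++)
            x[i] = out[i];
    }
    free(out);
    return 0;
}
```

With few keys and many digits, the k passes dominate, which is the situation where a comparison sort such as mergesort does better.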
Stability