04 Sorting Algorithms(1)
Algorithm Fundamentals
Sorting Algorithms
Lectures prepared by: Dr. Manal Alharbi, and Dr. Areej Alsini
Outline
• Input: A sequence of n numbers a1, a2, . . . , an.
• Output:
• A permutation (reordering) a1’, a2’, . . . , an’ of the input sequence such that
a1’ ≤ a2’ ≤ . . . ≤ an’.
Some Definitions
• Internal Sort
• The data to be sorted is all stored in the computer’s main memory.
• External Sort
• Some of the data to be sorted might be stored in some external, slower, device.
• In Place Sort
• The amount of extra space required to sort the data is constant with respect to the
input size.
Sorting Algorithms Review
• Insertion sort
• In-place: No additional data structure.
• O(n²)
• Selection sort
• In-place: No additional data structure.
• O(n²)
• Quick sort
• In-place: No additional data structure.
• O(n²) (worst case)
• Merge sort
• Idea: Merging two Sorted Lists
• Bottom-Up Merge Sort
• Needs an additional data structure.
• O(n log n)
Sorting Algorithms
• Insertion sort
• Selection sort
• Quick sort
• Randomized QuickSort
• Quick Select Algorithm
Insertion Sort
Algorithm: INSERTIONSORT
Input: An array A[1..n] of n elements.
Output: A[1..n] sorted in non-decreasing order.
Idea:
Scan the elements from the second position to the last, checking whether each is
in its correct position (step 1). If the current element is not in its correct
position (step 4), shift the larger elements to the right until its correct
position is found (step 5), then insert it there.
Insertion sort example: A = [6, 2, 11, 7]

• For i = 1 (unsorted array): x = A[1] = 2; j = i − 1 = 0.
while (j ≥ 0) and (A[j] > x): A[j + 1] = A[j] ➔ A[1] = A[0] → [6, 6, 11, 7]; j = j − 1 = −1 ➔ end while loop.
A[j + 1] = x ➔ A[0] = 2 → [2, 6, 11, 7]
• For i = 2: x = A[2] = 11; j = i − 1 = 1. A[1] = 6 ≤ 11, so the while loop does not execute → [2, 6, 11, 7]
• For i = 3: x = A[3] = 7; j = i − 1 = 2. A[2] = 11 > 7 ➔ A[3] = A[2] → [2, 6, 11, 11]; j = 1; A[1] = 6 ≤ 7 ➔ end while loop. A[2] = 7 → [2, 6, 7, 11]
• The maximum number of element comparisons is n(n − 1)/2, which occurs when the array
is sorted in reverse order and all elements are distinct:

(worst case) = Σ_{i=1}^{n−1} (n − i) = Σ_{i=1}^{n−1} i = n(n − 1)/2
Selection sort
• Example: A = [2, 6, 11, 7]. Start with i = k = 0 and scan j over the unsorted part,
updating k to the index of the smallest element seen so far; then swap A[i] and A[k].
• Repeat for i = 1, 2, …, n − 2.
Selection Sort – Analysis
• The inner for loop executes the size of the unsorted part minus 1 (from
1 to n-1), and in each iteration we make one key comparison.
➔ # of key comparisons = 1 + 2 + ... + (n − 1) = n(n − 1)/2
➔ So, selection sort is O(n²)
• The best case, the worst case, and the average case of the selection sort
algorithm are the same. ➔ all of them are O(n²)
• This means that the behavior of the selection sort algorithm does not depend
on the initial organization of the data.
• Since O(n²) grows so rapidly, the selection sort algorithm is appropriate only
for small n.
Remarks on Selection Sort and Insertion
Sort algorithms:
• Before we can discuss quick sort, we have to discuss partitioning first; after
that we use partitioning to build quick sort.
• Assume we have an array A with the following elements:
9 4 6 3 7 1 2 11 5
• For quick sort, we have to choose a pivot: we can choose the first element
“9”, the last element “5”, or any random element (randomized quicksort); in our
example we choose the last element “5”.
• We keep two indices i and j, where initially i points before the first element and
j points to the first element of the array. At each step, if the element at index j is
greater than the pivot (e.g., 9 > 5) we just increment j; otherwise (A[j] ≤ pivot) we
increment i, swap the values at indices i and j, and then increment j:
9 4 6 3 7 1 2 11 5
i = −1; j = 0; pivot = 5;
if (A[j] > pivot) j++;
• In the last step, j reaches the pivot’s location. Since the element at location j ≤ the pivot,
we increment i one final time and swap the pivot into position i + 1, between the two parts.
• As a result, all the elements to the left of “5” are less than or equal to “5”, and all the
elements to the right of “5” are greater than or equal to “5”.
Quick sort - example 2
Quick sort Algorithm
• Best case:
• In best case we assume that the quick sort is done on the exact middle.
• T(n) = Time to sort the first part T(n/2) + Time to sort the second part T(n/2) + Time
to partition O(n)
▪ T(n) = 2 T(n/2) + O(n)
Back-Substitution:
▪ T(n) = 2T(n/2) + n                                   … (1)
▪ T(n/2) = 2T(n/2²) + n/2                              … (2)  // replace n in (1) by n/2
▪ Substitute (2) into (1): T(n) = 2[2T(n/2²) + n/2] + n = 2²T(n/2²) + 2n   … (3)
▪ T(n/2²) = 2T(n/2³) + n/2². Substitute into (3):
T(n) = 2²[2T(n/2³) + n/2²] + 2n = 2³T(n/2³) + n + 2n = 2³T(n/2³) + 3n
Back-Substitution cont.:
• T(n) = 2T(n/2) + n
• T(n) = 2²T(n/2²) + 2n
• T(n) = 2³T(n/2³) + 3n
⋮
• T(n) = 2ᵏT(n/2ᵏ) + kn
• We want to reach the base case ➔ T(1); thus, let 2ᵏ = n ➔ k = log₂ n.
• Substitute k:
➔ 2ᵏT(n/2ᵏ) + kn ➔ 2^(log₂ n) T(n/2^(log₂ n)) + (log₂ n)n ➔ nT(n/n) + n log₂ n
➔ nT(1) + n log n ➔ n + n log n ➔ we can write: Ω(n log n) = best case.
Remember that n < n log n ➔ we can write the run time as (n log n) instead of
(n + n log n).
Worst case:
• The worst case occurs when the partition is completely unbalanced: j sweeps the
whole array and all elements fall on one side of the pivot (e.g., when the input is
already sorted and the last element is chosen as pivot).
• Then T(n) = T(n − 1) + O(n), which solves to O(n²).
How can we improve the time? The solution is randomized quick sort.
• Let’s take the previous example and apply randomized quick sort.
• We choose a random index between p and r (e.g., 2).
• We swap the value at that index with the last element.
• Then, we apply quick sort as before.
• Thus, when the items are already sorted, swapping a random element with the last
element makes them unsorted. Hence the partition will not fall entirely on one side
(all elements to the left or all to the right of the pivot); instead, it will land
somewhere in the middle.
• Instead of the above steps, quick sort can be converted to a randomized
algorithm by picking the pivot element randomly. In this case we can show
that the expected run time is O(n log n) (where the expectation is computed over
the space of all possible outcomes of the coin flips).
• Here, the analysis of the randomized algorithm is the same as for the
deterministic algorithm.
Randomized QuickSort
RandomizedQuickSort(A, p, r)
{
    if (p < r)    // if we have at least two elements
    {
        i = random(p, r);
        swap(A[i], A[r]);
        q = partitioning(A, p, r);
        RandomizedQuickSort(A, p, q − 1);
        RandomizedQuickSort(A, q + 1, r);
    }
}
• Special cases:
• When 𝑘 = 1, we are interested in finding the minimum of X.
• This can be done in 𝑂(𝑛) time.
2 3 1 9 7 6 10 5
• Can we do better?
• Do we need to sort all the elements in the array?
• We can devise an algorithm for selection that is similar to quick sort. This
algorithm is called quick select and works as follows:
Quick Select Algorithm
• Find the 4th smallest element (k = 4):
2 3 1 9 7 6 10 5
• We can devise an algorithm for selection that is similar to quick sort.
• Partition around the last element; the pivot 5 lands at index q:
2 3 1 5 7 6 10 9
p … q−1 | q | q+1 … r    (left subarray | pivot | right subarray)
• Since len_Left = 3, which is k − 1, the pivot is the answer ➔ output 5.
• Here, we recurse into either the left subarray or the right subarray → one side only.

QuickSelect(A, p, r, k)
{ // if at least we have more than one element
    if (p < r)
    {
        q = partitioning(A, p, r);
        len_Left = (q − 1) − p + 1
        if len_Left = k − 1 then
            output A[q]    // A[q] is the kth smallest
            return
        if len_Left ≥ k then
            QuickSelect(A, p, q − 1, k);
        else
            // remove the elements of the left subarray and the pivot
            // to get the correct rank in the right subarray
            QuickSelect(A, q + 1, r, k − len_Left − 1);
    }
}
Quick Select Algorithm
• Let T(n) be the run time of this algorithm on any input of size n.
• Then we have:
T(n) = n + max {T(|XL|),T(|XR|)}.
• One of the worst cases happens when one of the parts is empty on each
recursive call. In this case,
T(n) = n + T(n − 1) which solves to: T(n) = Θ(n2).
• One of the best cases is when both XL and XR are of nearly the same
size. In this case: T(n) = n + T(n/2) = n + n/2 + n/4 + ⋯ = Θ(n).
• We can also show that the average run time of quick select is O(n).
Merging Two Sorted Lists
Problem Description:
• Given two lists (arrays) that are sorted in non-decreasing order, and
• We need to merge them into one list sorted in non-decreasing order.
Algorithm MERGE
Algorithm: MERGE
Input: An array A[1..m] of elements and three indices p, q and r, with
1 ≤ p ≤ q <r ≤ m, such that both the sub-arrays A[p..q] and A[q + 1..r]
are sorted individually in non-decreasing order.
• Assume that the sizes of the two subarrays A[p..q] and A[q+1..r] to
be merged are n1 and n2, with n1 + n2 = n elements:
• The least/minimum number of comparisons occurs if each entry in the
smaller subarray (say of size n1) is less than all entries in the larger
subarray, as observed in the previous example.
• The number of comparisons in this case is n1.
Example (n = 11), merging levels from the input up to the sorted array:

Input:        5 2 | 9 8 | 4 12 | 7 1 | 3 6 | 10
After s = 1:  2 5 | 8 9 | 4 12 | 1 7 | 3 6 | 10
After s = 2:  2 5 8 9 | 1 4 7 12 | 3 6 10
After s = 4:  1 2 4 5 7 8 9 12 | 3 6 10
After s = 8:  1 2 3 4 5 6 7 8 9 10 12
Algorithm BOTTOMUPSORT

Algorithm: BOTTOMUPSORT
Input: An array A[1..n] of n elements.
Output: A[1..n] sorted in nondecreasing order.
1. t = 1
2. while t < n
3.     s = t; t = 2s; i = 0
4.     while i + t ≤ n
5.         MERGE(A, i + 1, i + s, i + t)
6.         i = i + t
7.     end while
8.     if i + s < n then MERGE(A, i + 1, i + s, n)
9. end while

• s = the size of the sequences to be merged. Initially, s is set to 1, and it is
doubled in each iteration of the outer while loop.
• i + 1, i + s and i + t define the boundaries of the two sequences to be merged.
• Step 8 is needed when n is not a multiple of t. In this case, if the number of
remaining elements, which is n − i, is greater than s, then one more merge is
applied on a sequence of size s and the remaining elements.
An example of the working of the algorithm when n is not a power of 2 (n = 11).
The behavior of the algorithm can be described as follows:
1. In the first iteration, s = 1 and t = 2. Five pairs of 1-element sequences are
merged to produce five 2-element sorted sequences. After the end of the inner
while loop, i + s = 10 + 1 ≮ n = 11, and hence no more merging takes place.
2. In the second iteration, s = 2 and t = 4. Two pairs
of 2-element sequences are merged to produce
two 4-element sorted sequences. After the end
of the inner while loop, i + s = 8 + 2 < n = 11, and
hence one sequence of size s = 2 is merged with
the one remaining element to produce a 3-
element sorted sequence.
3. In the third iteration, s = 4 and t = 8. One pair of
4-element sequences are merged to produce
one 8-element sorted sequence. After the end
of the inner while loop, i + s = 8 + 4 ≮ n = 11 and
hence no more merging takes place.
4. In the fourth iteration, s = 8 and t = 16. Since i + t = 0 + 16 ≰ n = 11, the
inner while loop is not executed. Since i + s = 0 + 8 < n = 11, the condition of
the if statement is satisfied, and hence one merge of the 8-element and
3-element sorted sequences takes place to produce a sorted sequence of size 11.
5. Since now t = 16 > n, the condition of the outer
while loop is not satisfied, and consequently the
algorithm terminates.
Analyzing Algorithm BOTTOMUPSORT
• With no loss of generality, assume that the size of the array, n, is a power of 2. In this case the outer while loop is
executed k = log n times, once for each level in the sorting tree.
• In the first iteration:
• the n sequences of one element each are merged in pairs. The number of comparisons needed to merge
each pair is one, so the number of comparisons is n/2.
➔ First, merge the n/2 consecutive pairs of elements to get n/2 sorted sequences of size 2. If there is one remaining
element, then it is passed to the next iteration (like element 7 in the previous example).
Observation: The total number of element comparisons performed by the algorithm to sort an array
of n elements, where n is a power of 2, is between (n log n)/2 and n log n − n + 1.
Run Time
• Time is an extremely precious resource to be investigated in the analysis of
algorithms. Consider the following example:
• Example 1:
• The maximum number of element comparisons performed by Algorithm BOTTOMUPSORT
when n is a power of 2 is n log n − n + 1.
• The number of element comparisons performed by Algorithm SELECTIONSORT is n(n − 1)/2.
• Results (for n = 128, at 10⁻⁶ seconds per comparison):
• Algorithm BOTTOMUPSORT takes at most 10⁻⁶ (128 × 7 − 128 + 1) ≈ 0.0008 seconds
• The objective is not just to analyze algorithms from the point of view of time;
the subject should be developed on solid reasoning that is independent of
factors like time, machine, compiler, etc.
• The running time of an algorithm is defined as the time needed by the algorithm
to deliver its output when presented with legal input. It is important to note that the
running time of an algorithm is measured in terms of the elementary operations
performed on the input elements.
Can we do better?
• Bucket sort
• Radix Sort
• We will show that we can sort n integers in the range [0, nᶜ − 1] in O(n)
time, for any constant c. The case c = 2 is very common in graph
algorithms and computational geometry algorithms.
• The idea of radix sorting is to sort the given keys with respect to some
number of bits (or digits) at a time.
• Key idea: sort on the “least significant digit” first and on the remaining digits in
sequential order. The sorting method used to sort each digit must be “stable”.
An Example
(Table: the input keys, then the keys after sorting on the least significant digit,
after sorting on the middle digit, and finally after sorting on the most significant digit.)
• Moral: The idea of radix sorting works if we use a stable sorting algorithm in every phase of sorting.
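A minimal LSD radix sort sketch in Python (base 10 for readability; each pass is a stable bucket distribution, which is exactly the property the moral above requires; the function name is ours):

```python
def radix_sort(a, base=10):
    """Sort non-negative integers by repeated stable passes over their digits,
    least significant digit first. Returns a new sorted list."""
    if not a:
        return []
    exp = 1
    while max(a) // exp > 0:             # one pass per digit position
        buckets = [[] for _ in range(base)]
        for x in a:                      # stable: equal digits keep input order
            buckets[(x // exp) % base].append(x)
        a = [x for b in buckets for x in b]
        exp *= base
    return a
```

Choosing base = n makes each key in [0, nᶜ − 1] have at most c digits, so the whole sort takes c stable passes of O(n) work each, i.e., O(n) for constant c.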